Class SparkLauncher
Use this class to start Spark applications programmatically. The class uses a builder pattern to allow clients to configure the Spark application and launch it as a child process.
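The chained, builder-style configuration described above can be pictured with a small stand-in. This is a minimal sketch, not SparkLauncher itself: `SubmitArgsSketch` and `toSubmitArgs` are hypothetical names, nothing is launched, and the mapping of settings to the spark-submit options `--master`, `--conf` and `--class` is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in that only assembles spark-submit arguments; it launches nothing. */
final class SubmitArgsSketch {
    private String master;
    private String mainClass;
    private String appResource;
    private final Map<String, String> conf = new LinkedHashMap<>();
    private final List<String> appArgs = new ArrayList<>();

    // Each setter returns this, which is what makes chained calls possible.
    SubmitArgsSketch setMaster(String master) { this.master = master; return this; }
    SubmitArgsSketch setMainClass(String mainClass) { this.mainClass = mainClass; return this; }
    SubmitArgsSketch setAppResource(String resource) { this.appResource = resource; return this; }
    SubmitArgsSketch setConf(String key, String value) { conf.put(key, value); return this; }
    SubmitArgsSketch addAppArgs(String... args) { Collections.addAll(appArgs, args); return this; }

    /** Builds the argument list that would follow "spark-submit" on a command line. */
    List<String> toSubmitArgs() {
        List<String> args = new ArrayList<>();
        if (master != null) { args.add("--master"); args.add(master); }
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        if (mainClass != null) { args.add("--class"); args.add(mainClass); }
        if (appResource != null) { args.add(appResource); }
        args.addAll(appArgs);
        return args;
    }

    public static void main(String[] unused) {
        List<String> args = new SubmitArgsSketch()
            .setAppResource("/my/app.jar")        // placeholder path
            .setMainClass("my.spark.app.Main")    // placeholder class name
            .setMaster("local")
            .setConf("spark.driver.memory", "2g")
            .addAppArgs("input.txt")
            .toSubmitArgs();
        System.out.println(String.join(" ", args));
    }
}
```

The setter names mirror the ones documented on this page, so the same chain of calls should read naturally against a real launcher instance.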
-
Field Summary
Modifier and Type | Field | Description
static final String | CHILD_CONNECTION_TIMEOUT | Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...).
static final String | CHILD_PROCESS_LOGGER_NAME | Logger name to use when launching a child process.
static final String | DEPLOY_MODE | The Spark deploy mode.
static final String | DEPRECATED_CHILD_CONNECTION_TIMEOUT | Deprecated. Use `CHILD_CONNECTION_TIMEOUT` instead.
static final String | DRIVER_DEFAULT_EXTRA_CLASS_PATH | Configuration key for the driver default extra class path.
static final String | DRIVER_DEFAULT_EXTRA_CLASS_PATH_VALUE |
static final String | DRIVER_DEFAULT_JAVA_OPTIONS | Configuration key for the default driver VM options.
static final String | DRIVER_EXTRA_CLASSPATH | Configuration key for the driver class path.
static final String | DRIVER_EXTRA_JAVA_OPTIONS | Configuration key for the driver VM options.
static final String | DRIVER_EXTRA_LIBRARY_PATH | Configuration key for the driver native library path.
static final String | DRIVER_MEMORY | Configuration key for the driver memory.
static final String | EXECUTOR_CORES | Configuration key for the number of executor CPU cores.
static final String | EXECUTOR_DEFAULT_EXTRA_CLASS_PATH | Configuration key for the executor default extra class path.
static final String | EXECUTOR_DEFAULT_EXTRA_CLASS_PATH_VALUE |
static final String | EXECUTOR_DEFAULT_JAVA_OPTIONS | Configuration key for the default executor VM options.
static final String | EXECUTOR_EXTRA_CLASSPATH | Configuration key for the executor class path.
static final String | EXECUTOR_EXTRA_JAVA_OPTIONS | Configuration key for the executor VM options.
static final String | EXECUTOR_EXTRA_LIBRARY_PATH | Configuration key for the executor native library path.
static final String | EXECUTOR_MEMORY | Configuration key for the executor memory.
static final String | NO_RESOURCE | A special value for the resource that tells Spark to not try to process the app resource as a file.
static final String | SPARK_LOCAL_REMOTE |
static final String | SPARK_MASTER | The Spark master.
static final String | SPARK_REMOTE | The Spark remote.
-
Constructor Summary
Constructor | Description
SparkLauncher() |
SparkLauncher(Map<String, String> env) | Creates a launcher that will set the given environment variables in the child.
-
Method Summary
Modifier and Type | Method | Description
SparkLauncher | addAppArgs(String... args) | Adds command line arguments for the application.
SparkLauncher | addFile(String file) | Adds a file to be submitted with the application.
SparkLauncher | addJar(String jar) | Adds a jar file to be submitted with the application.
SparkLauncher | addPyFile(String file) | Adds a python file / zip / egg to be submitted with the application.
SparkLauncher | addSparkArg(String arg) | Adds a no-value argument to the Spark invocation.
SparkLauncher | addSparkArg(String name, String value) | Adds an argument with a value to the Spark invocation.
SparkLauncher | directory(File dir) | Sets the working directory of spark-submit.
Process | launch() | Launches a sub-process that will start the configured Spark application.
SparkLauncher | redirectError() | Specifies that stderr in spark-submit should be redirected to stdout.
SparkLauncher | redirectError(File errFile) | Redirects error output to the specified File.
SparkLauncher | redirectError(ProcessBuilder.Redirect to) | Redirects error output to the specified Redirect.
SparkLauncher | redirectOutput(File outFile) | Redirects standard output to the specified File.
SparkLauncher | redirectOutput(ProcessBuilder.Redirect to) | Redirects standard output to the specified Redirect.
SparkLauncher | redirectToLog(String loggerName) | Sets all output to be logged and redirected to a logger with the specified name.
SparkLauncher | setAppName(String appName) | Set the application name.
SparkLauncher | setAppResource(String resource) | Set the main application resource.
SparkLauncher | setConf(String key, String value) | Set a single configuration value for the application.
static void | setConfig(String name, String value) | Set a configuration value for the launcher library.
SparkLauncher | setDeployMode(String mode) | Set the deploy mode for the application.
SparkLauncher | setJavaHome(String javaHome) | Set a custom JAVA_HOME for launching the Spark application.
SparkLauncher | setMainClass(String mainClass) | Sets the application class name for Java/Scala applications.
SparkLauncher | setMaster(String master) | Set the Spark master for the application.
SparkLauncher | setPropertiesFile(String path) | Set a custom properties file with Spark configuration for the application.
SparkLauncher | setSparkHome(String sparkHome) | Set a custom Spark installation location for the application.
SparkLauncher | setVerbose(boolean verbose) | Enables verbose reporting for SparkSubmit.
SparkAppHandle | startApplication(SparkAppHandle.Listener... listeners) | Starts a Spark application.

Methods inherited from class org.apache.spark.launcher.AbstractLauncher
setRemote
-
Field Details
-
SPARK_MASTER
The Spark master.
-
SPARK_REMOTE
The Spark remote.
-
SPARK_LOCAL_REMOTE
-
DEPLOY_MODE
The Spark deploy mode.
-
DRIVER_MEMORY
Configuration key for the driver memory.
-
DRIVER_DEFAULT_EXTRA_CLASS_PATH
Configuration key for the driver default extra class path.
-
DRIVER_DEFAULT_EXTRA_CLASS_PATH_VALUE
-
DRIVER_EXTRA_CLASSPATH
Configuration key for the driver class path.
-
DRIVER_DEFAULT_JAVA_OPTIONS
Configuration key for the default driver VM options.
-
DRIVER_EXTRA_JAVA_OPTIONS
Configuration key for the driver VM options.
-
DRIVER_EXTRA_LIBRARY_PATH
Configuration key for the driver native library path.
-
EXECUTOR_MEMORY
Configuration key for the executor memory.
-
EXECUTOR_DEFAULT_EXTRA_CLASS_PATH
Configuration key for the executor default extra class path.
-
EXECUTOR_DEFAULT_EXTRA_CLASS_PATH_VALUE
-
EXECUTOR_EXTRA_CLASSPATH
Configuration key for the executor class path.
-
EXECUTOR_DEFAULT_JAVA_OPTIONS
Configuration key for the default executor VM options.
-
EXECUTOR_EXTRA_JAVA_OPTIONS
Configuration key for the executor VM options.
-
EXECUTOR_EXTRA_LIBRARY_PATH
Configuration key for the executor native library path.
-
EXECUTOR_CORES
Configuration key for the number of executor CPU cores.
-
CHILD_PROCESS_LOGGER_NAME
Logger name to use when launching a child process.
-
NO_RESOURCE
A special value for the resource that tells Spark to not try to process the app resource as a file. This is useful when the class being executed is added to the application by other means, for example by adding jars using the package download feature.
-
DEPRECATED_CHILD_CONNECTION_TIMEOUT
Deprecated. Use `CHILD_CONNECTION_TIMEOUT` instead.
Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...).
- Since:
- 1.6.0
-
CHILD_CONNECTION_TIMEOUT
Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...).
-
-
Constructor Details
-
SparkLauncher
public SparkLauncher()
-
SparkLauncher
Creates a launcher that will set the given environment variables in the child.
- Parameters:
env - Environment variables to set.
-
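As an illustration of the kind of map the `env` parameter expects, the sketch below builds a few extra variables and applies them to a plain `java.lang.ProcessBuilder`. It assumes the launcher layers the map over the environment the child would otherwise inherit, which is how `ProcessBuilder` itself behaves; the variable names, values and the `ChildEnvSketch` class are examples only, and no process is started.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ChildEnvSketch {
    /** Extra variables for the child; the names and values here are only examples. */
    static Map<String, String> childEnv() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf");   // placeholder path
        env.put("SPARK_PRINT_LAUNCH_COMMAND", "1");
        return env;
    }

    /** Applies the variables on top of the environment a child would inherit. */
    static ProcessBuilder withEnv(ProcessBuilder pb, Map<String, String> env) {
        pb.environment().putAll(env);
        return pb;
    }

    public static void main(String[] args) {
        // No process is started; the builder is only inspected.
        ProcessBuilder pb = withEnv(new ProcessBuilder("spark-submit"), childEnv());
        System.out.println(pb.environment().get("HADOOP_CONF_DIR"));
        // With spark-launcher on the classpath, the same map would be passed as
        // new SparkLauncher(childEnv()).
    }
}
```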
-
Method Details
-
setConfig
Set a configuration value for the launcher library. These config values do not affect the launched application, but rather the behavior of the launcher library itself when managing applications.
- Parameters:
name - Config name.
value - Config value.
- Since:
- 1.6.0
-
setJavaHome
Set a custom JAVA_HOME for launching the Spark application.
- Parameters:
javaHome - Path to the JAVA_HOME to use.
- Returns:
- This launcher.
-
setSparkHome
Set a custom Spark installation location for the application.
- Parameters:
sparkHome - Path to the Spark installation to use.
- Returns:
- This launcher.
-
directory
Sets the working directory of spark-submit.
- Parameters:
dir - The directory to set as spark-submit's working directory.
- Returns:
- This launcher.
-
redirectError
Specifies that stderr in spark-submit should be redirected to stdout.
- Returns:
- This launcher.
-
redirectError
Redirects error output to the specified Redirect.
- Parameters:
to - The method of redirection.
- Returns:
- This launcher.
-
redirectOutput
Redirects standard output to the specified Redirect.
- Parameters:
to - The method of redirection.
- Returns:
- This launcher.
-
redirectError
Redirects error output to the specified File.
- Parameters:
errFile - The file to which stderr is written.
- Returns:
- This launcher.
-
redirectOutput
Redirects standard output to the specified File.
- Parameters:
outFile - The file to which stdout is written.
- Returns:
- This launcher.
-
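The "method of redirection" accepted by the Redirect overloads can be illustrated with the JDK alone. The sketch below assumes the `to` parameter is a `java.lang.ProcessBuilder.Redirect`; `RedirectSketch` and the file name are hypothetical, and the code only constructs and inspects redirect values without touching the file system or starting a process.

```java
import java.io.File;
import java.lang.ProcessBuilder.Redirect;

final class RedirectSketch {
    /** Appends to the log file instead of truncating it on every run. */
    static Redirect appendTo(File logFile) {
        return Redirect.appendTo(logFile);
    }

    /** Overwrites the file on every run. */
    static Redirect overwrite(File logFile) {
        return Redirect.to(logFile);
    }

    public static void main(String[] args) {
        File out = new File("spark-app.out");   // placeholder file name
        Redirect append = appendTo(out);
        System.out.println(append.type() + " -> " + append.file());
        // Redirect.INHERIT would send the child's stream to the parent's own stream.
        System.out.println(Redirect.INHERIT.type());
        // With spark-launcher on the classpath, such a value would be passed as
        // launcher.redirectOutput(append) or launcher.redirectError(append).
    }
}
```

Choosing between appending and overwriting matters mostly when the same launcher configuration is reused across runs and earlier output should be kept.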
redirectToLog
Sets all output to be logged and redirected to a logger with the specified name.
- Parameters:
loggerName - The name of the logger to log stdout and stderr.
- Returns:
- This launcher.
-
setPropertiesFile
Description copied from class: AbstractLauncher
Set a custom properties file with Spark configuration for the application.
- Overrides:
setPropertiesFile in class AbstractLauncher<SparkLauncher>
- Parameters:
path - Path to custom properties file to use.
- Returns:
- This launcher.
-
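A custom properties file can be produced with the JDK before handing its path to the launcher. The sketch below is written under two assumptions that this page does not state: that the file is read in `java.util.Properties` format, and that `spark.master`, `spark.driver.memory` and `spark.executor.memory` are the property names behind the corresponding constants. `SparkPropsFileSketch` and its methods are hypothetical helpers.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class SparkPropsFileSketch {
    /** Writes a few example settings to a temporary properties file and returns its path. */
    static Path writeProps() {
        Properties props = new Properties();
        props.setProperty("spark.master", "local[2]");
        props.setProperty("spark.driver.memory", "2g");
        props.setProperty("spark.executor.memory", "4g");
        try {
            Path file = Files.createTempFile("spark-custom", ".conf");
            try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                props.store(w, "Example Spark configuration");
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the file back, mainly to check what was written. */
    static Properties readProps(Path file) {
        Properties loaded = new Properties();
        try (Reader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            loaded.load(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return loaded;
    }

    public static void main(String[] args) {
        Path file = writeProps();
        System.out.println(file + " holds " + readProps(file).size() + " settings");
        // With spark-launcher on the classpath: launcher.setPropertiesFile(file.toString())
    }
}
```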
setConf
Description copied from class: AbstractLauncher
Set a single configuration value for the application.
- Overrides:
setConf in class AbstractLauncher<SparkLauncher>
- Parameters:
key - Configuration key.
value - The value to use.
- Returns:
- This launcher.
-
setAppName
Description copied from class: AbstractLauncher
Set the application name.
- Overrides:
setAppName in class AbstractLauncher<SparkLauncher>
- Parameters:
appName - Application name.
- Returns:
- This launcher.
-
setMaster
Description copied from class: AbstractLauncher
Set the Spark master for the application.
- Overrides:
setMaster in class AbstractLauncher<SparkLauncher>
- Parameters:
master - Spark master.
- Returns:
- This launcher.
-
setDeployMode
Description copied from class: AbstractLauncher
Set the deploy mode for the application.
- Overrides:
setDeployMode in class AbstractLauncher<SparkLauncher>
- Parameters:
mode - Deploy mode.
- Returns:
- This launcher.
-
setAppResource
Description copied from class: AbstractLauncher
Set the main application resource. This should be the location of a jar file for Scala/Java applications, or a python script for PySpark applications.
- Overrides:
setAppResource in class AbstractLauncher<SparkLauncher>
- Parameters:
resource - Path to the main application resource.
- Returns:
- This launcher.
-
setMainClass
Description copied from class: AbstractLauncher
Sets the application class name for Java/Scala applications.
- Overrides:
setMainClass in class AbstractLauncher<SparkLauncher>
- Parameters:
mainClass - Application's main class.
- Returns:
- This launcher.
-
addSparkArg
Description copied from class: AbstractLauncher
Adds a no-value argument to the Spark invocation. If the argument is known, this method validates whether the argument is indeed a no-value argument, and throws an exception otherwise.
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
- Overrides:
addSparkArg in class AbstractLauncher<SparkLauncher>
- Parameters:
arg - Argument to add.
- Returns:
- This launcher.
-
addSparkArg
Description copied from class: AbstractLauncher
Adds an argument with a value to the Spark invocation. If the argument name corresponds to a known argument, the code validates that the argument actually expects a value, and throws an exception otherwise.
It is safe to add arguments modified by other methods in this class (such as AbstractLauncher.setMaster(String)); the last invocation will be the one to take effect.
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
- Overrides:
addSparkArg in class AbstractLauncher<SparkLauncher>
- Parameters:
name - Name of argument to add.
value - Value of the argument.
- Returns:
- This launcher.
-
addAppArgs
Description copied from class: AbstractLauncher
Adds command line arguments for the application.
- Overrides:
addAppArgs in class AbstractLauncher<SparkLauncher>
- Parameters:
args - Arguments to pass to the application's main class.
- Returns:
- This launcher.
-
addJar
Description copied from class: AbstractLauncher
Adds a jar file to be submitted with the application.
- Overrides:
addJar in class AbstractLauncher<SparkLauncher>
- Parameters:
jar - Path to the jar file.
- Returns:
- This launcher.
-
addFile
Description copied from class: AbstractLauncher
Adds a file to be submitted with the application.
- Overrides:
addFile in class AbstractLauncher<SparkLauncher>
- Parameters:
file - Path to the file.
- Returns:
- This launcher.
-
addPyFile
Description copied from class: AbstractLauncher
Adds a python file / zip / egg to be submitted with the application.
- Overrides:
addPyFile in class AbstractLauncher<SparkLauncher>
- Parameters:
file - Path to the file.
- Returns:
- This launcher.
-
setVerbose
Description copied from class: AbstractLauncher
Enables verbose reporting for SparkSubmit.
- Overrides:
setVerbose in class AbstractLauncher<SparkLauncher>
- Parameters:
verbose - Whether to enable verbose output.
- Returns:
- This launcher.
-
launch
Launches a sub-process that will start the configured Spark application.
The startApplication(SparkAppHandle.Listener...) method is preferred when launching Spark, since it provides better control of the child application.
- Returns:
- A process handle for the Spark app.
- Throws:
IOException
-
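When the child is started this way and its output has not been redirected, the caller usually has to consume that output itself. The sketch below assumes the returned process handle is a `java.lang.Process`, whose `getInputStream()` carries the child's stdout; `ChildOutputSketch` and `drain` are hypothetical names, and an in-memory stream stands in for a real child so that nothing is launched.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ChildOutputSketch {
    /** Reads a child's output stream to the end and returns it line by line. */
    static List<String> drain(InputStream childOutput) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(childOutput, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for process.getInputStream(); no child process is started here.
        InputStream fake = new ByteArrayInputStream(
            "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
        drain(fake).forEach(System.out::println);
        // With a real Process returned by launch(): drain(process.getInputStream()),
        // then process.waitFor() for the exit code.
    }
}
```

Draining matters because a child that writes to an unread pipe can block once the pipe's buffer fills; redirecting output to a file or logger avoids the issue altogether.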
startApplication
Starts a Spark application.
Applications launched by this launcher run as child processes. The child's stdout and stderr are merged and written to a logger (see java.util.logging) only if redirection has not otherwise been configured on this SparkLauncher. The logger's name can be defined by setting CHILD_PROCESS_LOGGER_NAME in the app's configuration. If that option is not set, the code will try to derive a name from the application's name or main class / script file. If those cannot be determined, an internal, unique name will be used. In all cases, the logger name will start with "org.apache.spark.launcher.app", to fit more easily into the configuration of commonly-used logging systems.
- Specified by:
startApplication in class AbstractLauncher<SparkLauncher>
- Parameters:
listeners - Listeners to add to the handle before the app is launched.
- Returns:
- A handle for the launched application.
- Throws:
IOException
- Since:
- 1.6.0
-
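Because every child-output logger name starts with "org.apache.spark.launcher.app", a single `java.util.logging` handler on that prefix is enough to see the output of all launched applications. The following is a minimal sketch of that idea using only the JDK: `LauncherLogCapture` and the logger suffix "MyApp" are hypothetical, and a manual log call stands in for the launcher writing a line of child output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LauncherLogCapture {
    /** Common prefix of the child-output loggers, as stated in the description above. */
    static final String PREFIX = "org.apache.spark.launcher.app";

    /** Strong reference, so the configured logger is not garbage collected. */
    static final Logger PARENT = Logger.getLogger(PREFIX);

    /** Collects every message logged under the prefix into the returned list. */
    static List<String> capture() {
        List<String> messages = Collections.synchronizedList(new ArrayList<>());
        Handler handler = new Handler() {
            @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        PARENT.setLevel(Level.ALL);
        PARENT.addHandler(handler);
        return messages;
    }

    public static void main(String[] args) {
        List<String> messages = capture();
        // Stand-in for the launcher writing a line of child output; "MyApp" is hypothetical.
        Logger.getLogger(PREFIX + ".MyApp").info("child says hello");
        System.out.println(messages);
    }
}
```

The handler sits on the parent logger, so it relies on child loggers forwarding records to their parents, which is the default in `java.util.logging`.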