public class SparkLauncher extends AbstractLauncher<SparkLauncher>
Use this class to start Spark applications programmatically. The class uses a builder pattern to allow clients to configure the Spark application and launch it as a child process.
Modifier and Type | Field and Description |
---|---|
static String |
CHILD_CONNECTION_TIMEOUT
Maximum time (in ms) to wait for a child process to connect back to the launcher server
when using startApplication(SparkAppHandle.Listener...).
|
static String |
CHILD_PROCESS_LOGGER_NAME
Logger name to use when launching a child process.
|
static String |
DEPLOY_MODE
The Spark deploy mode.
|
static String |
DEPRECATED_CHILD_CONNECTION_TIMEOUT
Deprecated.
Use CHILD_CONNECTION_TIMEOUT instead.
|
static String |
DRIVER_DEFAULT_JAVA_OPTIONS
Configuration key for the default driver VM options.
|
static String |
DRIVER_EXTRA_CLASSPATH
Configuration key for the driver class path.
|
static String |
DRIVER_EXTRA_JAVA_OPTIONS
Configuration key for the driver VM options.
|
static String |
DRIVER_EXTRA_LIBRARY_PATH
Configuration key for the driver native library path.
|
static String |
DRIVER_MEMORY
Configuration key for the driver memory.
|
static String |
EXECUTOR_CORES
Configuration key for the number of executor CPU cores.
|
static String |
EXECUTOR_DEFAULT_JAVA_OPTIONS
Configuration key for the default executor VM options.
|
static String |
EXECUTOR_EXTRA_CLASSPATH
Configuration key for the executor class path.
|
static String |
EXECUTOR_EXTRA_JAVA_OPTIONS
Configuration key for the executor VM options.
|
static String |
EXECUTOR_EXTRA_LIBRARY_PATH
Configuration key for the executor native library path.
|
static String |
EXECUTOR_MEMORY
Configuration key for the executor memory.
|
static String |
NO_RESOURCE
A special value for the resource that tells Spark to not try to process the app resource as a
file.
|
static String |
SPARK_LOCAL_REMOTE |
static String |
SPARK_MASTER
The Spark master.
|
static String |
SPARK_REMOTE
The Spark remote.
|
Constructor and Description |
---|
SparkLauncher() |
SparkLauncher(java.util.Map<String,String> env)
Creates a launcher that will set the given environment variables in the child.
|
Modifier and Type | Method and Description |
---|---|
SparkLauncher |
addAppArgs(String... args)
Adds command line arguments for the application.
|
SparkLauncher |
addFile(String file)
Adds a file to be submitted with the application.
|
SparkLauncher |
addJar(String jar)
Adds a jar file to be submitted with the application.
|
SparkLauncher |
addPyFile(String file)
Adds a Python file / zip / egg to be submitted with the application.
|
SparkLauncher |
addSparkArg(String arg)
Adds a no-value argument to the Spark invocation.
|
SparkLauncher |
addSparkArg(String name,
String value)
Adds an argument with a value to the Spark invocation.
|
SparkLauncher |
directory(java.io.File dir)
Sets the working directory of spark-submit.
|
Process |
launch()
Launches a sub-process that will start the configured Spark application.
|
SparkLauncher |
redirectError()
Specifies that stderr in spark-submit should be redirected to stdout.
|
SparkLauncher |
redirectError(java.io.File errFile)
Redirects error output to the specified File.
|
SparkLauncher |
redirectError(ProcessBuilder.Redirect to)
Redirects error output to the specified Redirect.
|
SparkLauncher |
redirectOutput(java.io.File outFile)
Redirects standard output to the specified File.
|
SparkLauncher |
redirectOutput(ProcessBuilder.Redirect to)
Redirects standard output to the specified Redirect.
|
SparkLauncher |
redirectToLog(String loggerName)
Sets all output to be logged and redirected to a logger with the specified name.
|
SparkLauncher |
setAppName(String appName)
Set the application name.
|
SparkLauncher |
setAppResource(String resource)
Set the main application resource.
|
SparkLauncher |
setConf(String key,
String value)
Set a single configuration value for the application.
|
static void |
setConfig(String name,
String value)
Set a configuration value for the launcher library.
|
SparkLauncher |
setDeployMode(String mode)
Set the deploy mode for the application.
|
SparkLauncher |
setJavaHome(String javaHome)
Set a custom JAVA_HOME for launching the Spark application.
|
SparkLauncher |
setMainClass(String mainClass)
Sets the application class name for Java/Scala applications.
|
SparkLauncher |
setMaster(String master)
Set the Spark master for the application.
|
SparkLauncher |
setPropertiesFile(String path)
Set a custom properties file with Spark configuration for the application.
|
SparkLauncher |
setSparkHome(String sparkHome)
Set a custom Spark installation location for the application.
|
SparkLauncher |
setVerbose(boolean verbose)
Enables verbose reporting for SparkSubmit.
|
SparkAppHandle |
startApplication(SparkAppHandle.Listener... listeners)
Starts a Spark application.
|
Methods inherited from class AbstractLauncher: setRemote
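The redirectError and redirectOutput overloads summarized above take either a java.io.File or a java.lang.ProcessBuilder.Redirect. As a rough illustration of the Redirect values a caller might pass, the sketch below builds the three common kinds using only the JDK; it does not call any Spark class. The class name RedirectKinds and the file name spark-submit.out are hypothetical.

```java
import java.io.File;
import java.lang.ProcessBuilder.Redirect;

// Stand-alone illustration of java.lang.ProcessBuilder.Redirect values.
// RedirectKinds is a hypothetical helper, not part of the launcher library.
class RedirectKinds {

    // Overwrite the target file each time the child process starts.
    static Redirect overwrite(File target) {
        return Redirect.to(target);
    }

    // Append to the target file, keeping output from earlier runs.
    static Redirect append(File target) {
        return Redirect.appendTo(target);
    }

    // Share the parent's stream (for example the console).
    static Redirect inherit() {
        return Redirect.INHERIT;
    }

    public static void main(String[] args) {
        File log = new File("spark-submit.out"); // hypothetical file name
        // Building a Redirect does not create or open the file.
        System.out.println(overwrite(log).type());
        System.out.println(append(log).type());
        System.out.println(inherit().type());
    }
}
```

A value such as RedirectKinds.append(log) could then be passed to redirectOutput(ProcessBuilder.Redirect to) or redirectError(ProcessBuilder.Redirect to) on a configured launcher.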
public static final String SPARK_MASTER
public static final String SPARK_REMOTE
public static final String SPARK_LOCAL_REMOTE
public static final String DEPLOY_MODE
public static final String DRIVER_MEMORY
public static final String DRIVER_EXTRA_CLASSPATH
public static final String DRIVER_DEFAULT_JAVA_OPTIONS
public static final String DRIVER_EXTRA_JAVA_OPTIONS
public static final String DRIVER_EXTRA_LIBRARY_PATH
public static final String EXECUTOR_MEMORY
public static final String EXECUTOR_EXTRA_CLASSPATH
public static final String EXECUTOR_DEFAULT_JAVA_OPTIONS
public static final String EXECUTOR_EXTRA_JAVA_OPTIONS
public static final String EXECUTOR_EXTRA_LIBRARY_PATH
public static final String EXECUTOR_CORES
public static final String CHILD_PROCESS_LOGGER_NAME
public static final String NO_RESOURCE
public static final String DEPRECATED_CHILD_CONNECTION_TIMEOUT
public static final String CHILD_CONNECTION_TIMEOUT
public SparkLauncher()
public SparkLauncher(java.util.Map<String,String> env)
Parameters:
env - Environment variables to set.
public static void setConfig(String name, String value)
Parameters:
name - Config name.
value - Config value.
public SparkLauncher setJavaHome(String javaHome)
Parameters:
javaHome - Path to the JAVA_HOME to use.
public SparkLauncher setSparkHome(String sparkHome)
Parameters:
sparkHome - Path to the Spark installation to use.
public SparkLauncher directory(java.io.File dir)
Parameters:
dir - The directory to set as spark-submit's working directory.
public SparkLauncher redirectError()
public SparkLauncher redirectError(ProcessBuilder.Redirect to)
Parameters:
to - The method of redirection.
public SparkLauncher redirectOutput(ProcessBuilder.Redirect to)
Parameters:
to - The method of redirection.
public SparkLauncher redirectError(java.io.File errFile)
Parameters:
errFile - The file to which stderr is written.
public SparkLauncher redirectOutput(java.io.File outFile)
Parameters:
outFile - The file to which stdout is written.
public SparkLauncher redirectToLog(String loggerName)
Parameters:
loggerName - The name of the logger to log stdout and stderr.
public SparkLauncher setPropertiesFile(String path)
Overrides:
setPropertiesFile in class AbstractLauncher<SparkLauncher>
Parameters:
path - Path to custom properties file to use.
public SparkLauncher setConf(String key, String value)
Overrides:
setConf in class AbstractLauncher<SparkLauncher>
Parameters:
key - Configuration key.
value - The value to use.
public SparkLauncher setAppName(String appName)
Overrides:
setAppName in class AbstractLauncher<SparkLauncher>
Parameters:
appName - Application name.
public SparkLauncher setMaster(String master)
Overrides:
setMaster in class AbstractLauncher<SparkLauncher>
Parameters:
master - Spark master.
public SparkLauncher setDeployMode(String mode)
Overrides:
setDeployMode in class AbstractLauncher<SparkLauncher>
Parameters:
mode - Deploy mode.
public SparkLauncher setAppResource(String resource)
Overrides:
setAppResource in class AbstractLauncher<SparkLauncher>
Parameters:
resource - Path to the main application resource.
public SparkLauncher setMainClass(String mainClass)
Overrides:
setMainClass in class AbstractLauncher<SparkLauncher>
Parameters:
mainClass - Application's main class.
public SparkLauncher addSparkArg(String arg)
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
Overrides:
addSparkArg in class AbstractLauncher<SparkLauncher>
Parameters:
arg - Argument to add.
public SparkLauncher addSparkArg(String name, String value)
It is safe to add arguments modified by other methods in this class (such as AbstractLauncher.setMaster(String)); the last invocation will be the one to take effect.
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
Overrides:
addSparkArg in class AbstractLauncher<SparkLauncher>
Parameters:
name - Name of argument to add.
value - Value of the argument.
public SparkLauncher addAppArgs(String... args)
Overrides:
addAppArgs in class AbstractLauncher<SparkLauncher>
Parameters:
args - Arguments to pass to the application's main class.
public SparkLauncher addJar(String jar)
Overrides:
addJar in class AbstractLauncher<SparkLauncher>
Parameters:
jar - Path to the jar file.
public SparkLauncher addFile(String file)
Overrides:
addFile in class AbstractLauncher<SparkLauncher>
Parameters:
file - Path to the file.
public SparkLauncher addPyFile(String file)
Overrides:
addPyFile in class AbstractLauncher<SparkLauncher>
Parameters:
file - Path to the file.
public SparkLauncher setVerbose(boolean verbose)
Overrides:
setVerbose in class AbstractLauncher<SparkLauncher>
Parameters:
verbose - Whether to enable verbose output.
public Process launch() throws java.io.IOException
The startApplication(SparkAppHandle.Listener...) method is preferred when launching Spark, since it provides better control of the child application.
Throws:
java.io.IOException
public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws java.io.IOException
Applications launched by this launcher run as child processes. The child's stdout and stderr are merged and written to a logger (see java.util.logging) only if redirection has not otherwise been configured on this SparkLauncher. The logger's name can be defined by setting CHILD_PROCESS_LOGGER_NAME in the app's configuration. If that option is not set, the code will try to derive a name from the application's name or main class / script file. If those cannot be determined, an internal, unique name will be used. In all cases, the logger name will start with "org.apache.spark.launcher.app", to fit more easily into the configuration of commonly-used logging systems.
Overrides:
startApplication in class AbstractLauncher<SparkLauncher>
Parameters:
listeners - Listeners to add to the handle before the app is launched.
Throws:
java.io.IOException
See Also:
AbstractLauncher.startApplication(SparkAppHandle.Listener...)
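Because every logger name used by startApplication starts with "org.apache.spark.launcher.app", a handler attached to that parent name also receives records logged under the more specific child names. The following is a minimal sketch of that java.util.logging behavior using only the JDK; no Spark process is started, and the class name LauncherLogCapture, the suffix passed by the caller, and the INFO level are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical helper showing how one handler on the common prefix
// captures records from any logger whose name starts with that prefix.
class LauncherLogCapture {
    static final String PREFIX = "org.apache.spark.launcher.app";

    // Logs one message on "<PREFIX>.<childSuffix>" and returns what the
    // handler attached to PREFIX observed.
    static List<String> capture(String childSuffix, String message) {
        List<String> seen = new ArrayList<>();
        Handler collector = new Handler() {
            @Override public void publish(LogRecord record) {
                seen.add(record.getLoggerName() + ": " + record.getMessage());
            }
            @Override public void flush() { }
            @Override public void close() { }
        };

        Logger parent = Logger.getLogger(PREFIX);
        parent.setLevel(Level.INFO);         // assumed level for this sketch
        parent.setUseParentHandlers(false);  // keep records off the root console handler
        parent.addHandler(collector);
        try {
            // The child inherits the parent's level and forwards to its handlers.
            Logger child = Logger.getLogger(PREFIX + "." + childSuffix);
            child.info(message);
        } finally {
            parent.removeHandler(collector);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(capture("MyApp", "hello"));
    }
}
```

In a real deployment the same effect is usually achieved through logging configuration for the "org.apache.spark.launcher.app" prefix rather than by attaching handlers in code.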