Class SparkLauncher

Object
org.apache.spark.launcher.AbstractLauncher<SparkLauncher>
org.apache.spark.launcher.SparkLauncher

public class SparkLauncher extends AbstractLauncher<SparkLauncher>
Launcher for Spark applications.

Use this class to start Spark applications programmatically. The class uses a builder pattern to allow clients to configure the Spark application and launch it as a child process.
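
The builder idea can be pictured with a small stand-in that uses only the JDK. This is a sketch under assumptions, not the library's implementation: `SubmitCommandSketch` and `buildCommand` are hypothetical names, and the real launcher's command construction is more involved and may order arguments differently. It only shows how chained setters can accumulate state that later becomes the command line of a child `spark-submit` process.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the Spark API: fluent setters collect settings,
// and buildCommand() turns them into a spark-submit style argument list.
class SubmitCommandSketch {
    private String master;
    private String mainClass;
    private String appResource;
    private final Map<String, String> conf = new LinkedHashMap<>();
    private final List<String> appArgs = new ArrayList<>();

    SubmitCommandSketch setMaster(String master) { this.master = master; return this; }
    SubmitCommandSketch setMainClass(String mainClass) { this.mainClass = mainClass; return this; }
    SubmitCommandSketch setAppResource(String resource) { this.appResource = resource; return this; }
    SubmitCommandSketch setConf(String key, String value) { conf.put(key, value); return this; }
    SubmitCommandSketch addAppArgs(String... args) { Collections.addAll(appArgs, args); return this; }

    // Assembles the arguments a child process could be started with.
    // The ordering here is an assumption made for readability.
    List<String> buildCommand() {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        if (master != null) { cmd.add("--master"); cmd.add(master); }
        for (Map.Entry<String, String> e : conf.entrySet()) {
            cmd.add("--conf");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        if (mainClass != null) { cmd.add("--class"); cmd.add(mainClass); }
        if (appResource != null) { cmd.add(appResource); }
        cmd.addAll(appArgs);  // application arguments always come last
        return cmd;
    }
}
```

With the real class, the analogous chain would call the setters documented below on a `new SparkLauncher()` and finish with `launch()` or `startApplication(...)` instead of `buildCommand()`.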

  • Field Details

    • SPARK_MASTER

      public static final String SPARK_MASTER
      The Spark master.
    • SPARK_REMOTE

      public static final String SPARK_REMOTE
      The Spark remote.
    • SPARK_LOCAL_REMOTE

      public static final String SPARK_LOCAL_REMOTE
    • DEPLOY_MODE

      public static final String DEPLOY_MODE
      The Spark deploy mode.
    • DRIVER_MEMORY

      public static final String DRIVER_MEMORY
      Configuration key for the driver memory.
    • DRIVER_DEFAULT_EXTRA_CLASS_PATH

      public static final String DRIVER_DEFAULT_EXTRA_CLASS_PATH
      Configuration key for the driver default extra class path.
    • DRIVER_DEFAULT_EXTRA_CLASS_PATH_VALUE

      public static final String DRIVER_DEFAULT_EXTRA_CLASS_PATH_VALUE
    • DRIVER_EXTRA_CLASSPATH

      public static final String DRIVER_EXTRA_CLASSPATH
      Configuration key for the driver class path.
    • DRIVER_DEFAULT_JAVA_OPTIONS

      public static final String DRIVER_DEFAULT_JAVA_OPTIONS
      Configuration key for the default driver VM options.
    • DRIVER_EXTRA_JAVA_OPTIONS

      public static final String DRIVER_EXTRA_JAVA_OPTIONS
      Configuration key for the driver VM options.
    • DRIVER_EXTRA_LIBRARY_PATH

      public static final String DRIVER_EXTRA_LIBRARY_PATH
      Configuration key for the driver native library path.
    • EXECUTOR_MEMORY

      public static final String EXECUTOR_MEMORY
      Configuration key for the executor memory.
    • EXECUTOR_DEFAULT_EXTRA_CLASS_PATH

      public static final String EXECUTOR_DEFAULT_EXTRA_CLASS_PATH
      Configuration key for the executor default extra class path.
    • EXECUTOR_DEFAULT_EXTRA_CLASS_PATH_VALUE

      public static final String EXECUTOR_DEFAULT_EXTRA_CLASS_PATH_VALUE
    • EXECUTOR_EXTRA_CLASSPATH

      public static final String EXECUTOR_EXTRA_CLASSPATH
      Configuration key for the executor class path.
    • EXECUTOR_DEFAULT_JAVA_OPTIONS

      public static final String EXECUTOR_DEFAULT_JAVA_OPTIONS
      Configuration key for the default executor VM options.
    • EXECUTOR_EXTRA_JAVA_OPTIONS

      public static final String EXECUTOR_EXTRA_JAVA_OPTIONS
      Configuration key for the executor VM options.
    • EXECUTOR_EXTRA_LIBRARY_PATH

      public static final String EXECUTOR_EXTRA_LIBRARY_PATH
      Configuration key for the executor native library path.
    • EXECUTOR_CORES

      public static final String EXECUTOR_CORES
      Configuration key for the number of executor CPU cores.
    • CHILD_PROCESS_LOGGER_NAME

      public static final String CHILD_PROCESS_LOGGER_NAME
      Logger name to use when launching a child process.
    • NO_RESOURCE

      public static final String NO_RESOURCE
      A special value for the resource that tells Spark not to process the app resource as a file. This is useful when the class being executed is added to the application by other means - for example, by adding jars using the package download feature.
    • DEPRECATED_CHILD_CONNECTION_TIMEOUT

      @Deprecated(since="3.2.0") public static final String DEPRECATED_CHILD_CONNECTION_TIMEOUT
      Deprecated.
      Use CHILD_CONNECTION_TIMEOUT instead.
      Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...).
      Since:
      1.6.0
    • CHILD_CONNECTION_TIMEOUT

      public static final String CHILD_CONNECTION_TIMEOUT
      Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...).
  • Constructor Details

    • SparkLauncher

      public SparkLauncher()
    • SparkLauncher

      public SparkLauncher(Map<String,String> env)
      Creates a launcher that will set the given environment variables in the child.
      Parameters:
      env - Environment variables to set.
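
      Setting environment variables "in the child" can be illustrated with the JDK's own ProcessBuilder. This is a minimal sketch, not the launcher's code: `ChildEnvSketch` and `withEnv` are hypothetical names, and the variable used in the usage note is only an example value.

```java
import java.util.Map;

// Hypothetical helper showing how extra variables reach a child process.
class ChildEnvSketch {
    // Copies the given variables into the environment of a not-yet-started
    // child process. The parent's own environment is left untouched.
    static ProcessBuilder withEnv(ProcessBuilder pb, Map<String, String> env) {
        pb.environment().putAll(env);
        return pb;
    }
}
```

      For example, `ChildEnvSketch.withEnv(new ProcessBuilder("spark-submit"), Map.of("MY_VAR", "1"))` prepares a child that would see MY_VAR=1. With the real class, the corresponding step is passing the map to `new SparkLauncher(env)`.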
  • Method Details

    • setConfig

      public static void setConfig(String name, String value)
      Set a configuration value for the launcher library. These config values do not affect the launched application, but rather the behavior of the launcher library itself when managing applications.
      Parameters:
      name - Config name.
      value - Config value.
      Since:
      1.6.0
    • setJavaHome

      public SparkLauncher setJavaHome(String javaHome)
      Set a custom JAVA_HOME for launching the Spark application.
      Parameters:
      javaHome - Path to the JAVA_HOME to use.
      Returns:
      This launcher.
    • setSparkHome

      public SparkLauncher setSparkHome(String sparkHome)
      Set a custom Spark installation location for the application.
      Parameters:
      sparkHome - Path to the Spark installation to use.
      Returns:
      This launcher.
    • directory

      public SparkLauncher directory(File dir)
      Sets the working directory of spark-submit.
      Parameters:
      dir - The directory to set as spark-submit's working directory.
      Returns:
      This launcher.
    • redirectError

      public SparkLauncher redirectError()
      Specifies that stderr in spark-submit should be redirected to stdout.
      Returns:
      This launcher.
    • redirectError

      public SparkLauncher redirectError(ProcessBuilder.Redirect to)
      Redirects error output to the specified Redirect.
      Parameters:
      to - The method of redirection.
      Returns:
      This launcher.
    • redirectOutput

      public SparkLauncher redirectOutput(ProcessBuilder.Redirect to)
      Redirects standard output to the specified Redirect.
      Parameters:
      to - The method of redirection.
      Returns:
      This launcher.
    • redirectError

      public SparkLauncher redirectError(File errFile)
      Redirects error output to the specified File.
      Parameters:
      errFile - The file to which stderr is written.
      Returns:
      This launcher.
    • redirectOutput

      public SparkLauncher redirectOutput(File outFile)
      Redirects standard output to the specified File.
      Parameters:
      outFile - The file to which stdout is written.
      Returns:
      This launcher.
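
      The redirection methods above take the same Redirect and File types that java.lang.ProcessBuilder uses, so their effect can be pictured with plain JDK calls. The sketch below is an illustration under that assumption, not the launcher's implementation; `RedirectSketch` and `mergeToFile` are hypothetical names.

```java
import java.io.File;

// Hypothetical helper: merge stderr into stdout, then send stdout to a file.
class RedirectSketch {
    static ProcessBuilder mergeToFile(ProcessBuilder pb, File outFile) {
        // Roughly what redirectError() asks for: stderr follows stdout.
        pb.redirectErrorStream(true);
        // Roughly what redirectOutput(File) asks for: stdout is written to outFile.
        pb.redirectOutput(outFile);
        return pb;
    }
}
```

      Nothing is started here; the settings only take effect once the child process is launched.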
    • redirectToLog

      public SparkLauncher redirectToLog(String loggerName)
      Sets all output to be logged and redirected to a logger with the specified name.
      Parameters:
      loggerName - The name of the logger to log stdout and stderr.
      Returns:
      This launcher.
    • setPropertiesFile

      public SparkLauncher setPropertiesFile(String path)
      Description copied from class: AbstractLauncher
      Set a custom properties file with Spark configuration for the application.
      Overrides:
      setPropertiesFile in class AbstractLauncher<SparkLauncher>
      Parameters:
      path - Path to custom properties file to use.
      Returns:
      This launcher.
    • setConf

      public SparkLauncher setConf(String key, String value)
      Description copied from class: AbstractLauncher
      Set a single configuration value for the application.
      Overrides:
      setConf in class AbstractLauncher<SparkLauncher>
      Parameters:
      key - Configuration key.
      value - The value to use.
      Returns:
      This launcher.
    • setAppName

      public SparkLauncher setAppName(String appName)
      Description copied from class: AbstractLauncher
      Set the application name.
      Overrides:
      setAppName in class AbstractLauncher<SparkLauncher>
      Parameters:
      appName - Application name.
      Returns:
      This launcher.
    • setMaster

      public SparkLauncher setMaster(String master)
      Description copied from class: AbstractLauncher
      Set the Spark master for the application.
      Overrides:
      setMaster in class AbstractLauncher<SparkLauncher>
      Parameters:
      master - Spark master.
      Returns:
      This launcher.
    • setDeployMode

      public SparkLauncher setDeployMode(String mode)
      Description copied from class: AbstractLauncher
      Set the deploy mode for the application.
      Overrides:
      setDeployMode in class AbstractLauncher<SparkLauncher>
      Parameters:
      mode - Deploy mode.
      Returns:
      This launcher.
    • setAppResource

      public SparkLauncher setAppResource(String resource)
      Description copied from class: AbstractLauncher
      Set the main application resource. This should be the location of a jar file for Scala/Java applications, or a python script for PySpark applications.
      Overrides:
      setAppResource in class AbstractLauncher<SparkLauncher>
      Parameters:
      resource - Path to the main application resource.
      Returns:
      This launcher.
    • setMainClass

      public SparkLauncher setMainClass(String mainClass)
      Description copied from class: AbstractLauncher
      Sets the application class name for Java/Scala applications.
      Overrides:
      setMainClass in class AbstractLauncher<SparkLauncher>
      Parameters:
      mainClass - Application's main class.
      Returns:
      This launcher.
    • addSparkArg

      public SparkLauncher addSparkArg(String arg)
      Description copied from class: AbstractLauncher
      Adds a no-value argument to the Spark invocation. If the argument is known, this method validates whether the argument is indeed a no-value argument, and throws an exception otherwise.

      Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.

      Overrides:
      addSparkArg in class AbstractLauncher<SparkLauncher>
      Parameters:
      arg - Argument to add.
      Returns:
      This launcher.
    • addSparkArg

      public SparkLauncher addSparkArg(String name, String value)
      Description copied from class: AbstractLauncher
      Adds an argument with a value to the Spark invocation. If the argument name corresponds to a known argument, the code validates that the argument actually expects a value, and throws an exception otherwise.

      It is safe to add arguments modified by other methods in this class (such as AbstractLauncher.setMaster(String)); the last invocation will be the one to take effect.

      Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.

      Overrides:
      addSparkArg in class AbstractLauncher<SparkLauncher>
      Parameters:
      name - Name of argument to add.
      value - Value of the argument.
      Returns:
      This launcher.
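
      The validation rule shared by both addSparkArg overloads can be sketched as follows. This is a simplified model, not the library's validator: `SparkArgSketch` is a hypothetical class, and the two sets hold only a small assumed subset of spark-submit arguments. Known arguments are checked for whether they take a value; unknown ones are accepted unchanged, which is why an invalid command can still be built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the "known argument" check described above.
class SparkArgSketch {
    // Assumed subsets, for illustration only.
    private static final Set<String> KNOWN_SWITCHES =
        new HashSet<>(Arrays.asList("--verbose", "--supervise"));
    private static final Set<String> KNOWN_OPTIONS =
        new HashSet<>(Arrays.asList("--master", "--deploy-mode", "--class"));

    private final List<String> args = new ArrayList<>();

    // No-value form: rejects a known argument that needs a value.
    SparkArgSketch addSparkArg(String arg) {
        if (KNOWN_OPTIONS.contains(arg)) {
            throw new IllegalArgumentException(arg + " expects a value");
        }
        args.add(arg);  // known switch, or unknown argument accepted as-is
        return this;
    }

    // Name/value form: rejects a known argument that takes no value.
    SparkArgSketch addSparkArg(String name, String value) {
        if (KNOWN_SWITCHES.contains(name)) {
            throw new IllegalArgumentException(name + " does not take a value");
        }
        args.add(name);
        args.add(value);
        return this;
    }

    List<String> args() { return args; }
}
```

      The sketch does not model the "last invocation wins" behavior for arguments that other setters also control.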
    • addAppArgs

      public SparkLauncher addAppArgs(String... args)
      Description copied from class: AbstractLauncher
      Adds command line arguments for the application.
      Overrides:
      addAppArgs in class AbstractLauncher<SparkLauncher>
      Parameters:
      args - Arguments to pass to the application's main class.
      Returns:
      This launcher.
    • addJar

      public SparkLauncher addJar(String jar)
      Description copied from class: AbstractLauncher
      Adds a jar file to be submitted with the application.
      Overrides:
      addJar in class AbstractLauncher<SparkLauncher>
      Parameters:
      jar - Path to the jar file.
      Returns:
      This launcher.
    • addFile

      public SparkLauncher addFile(String file)
      Description copied from class: AbstractLauncher
      Adds a file to be submitted with the application.
      Overrides:
      addFile in class AbstractLauncher<SparkLauncher>
      Parameters:
      file - Path to the file.
      Returns:
      This launcher.
    • addPyFile

      public SparkLauncher addPyFile(String file)
      Description copied from class: AbstractLauncher
      Adds a python file / zip / egg to be submitted with the application.
      Overrides:
      addPyFile in class AbstractLauncher<SparkLauncher>
      Parameters:
      file - Path to the file.
      Returns:
      This launcher.
    • setVerbose

      public SparkLauncher setVerbose(boolean verbose)
      Description copied from class: AbstractLauncher
      Enables verbose reporting for SparkSubmit.
      Overrides:
      setVerbose in class AbstractLauncher<SparkLauncher>
      Parameters:
      verbose - Whether to enable verbose output.
      Returns:
      This launcher.
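
      The setters above are inherited from AbstractLauncher<SparkLauncher> yet return SparkLauncher, so inherited and SparkLauncher-only methods can be mixed in one chain. A self-typed generic base class is one way to get that effect. The miniature below is a sketch of the pattern with hypothetical names (`BaseLauncherSketch`, `ProcessLauncherSketch`, `self`, `describe`), not a copy of the library's classes.

```java
// Hypothetical miniature of a self-typed builder hierarchy.
abstract class BaseLauncherSketch<T extends BaseLauncherSketch<T>> {
    protected String appName;
    protected boolean verbose;

    // Each concrete subclass returns itself, so chained calls keep its type.
    abstract T self();

    T setAppName(String appName) { this.appName = appName; return self(); }
    T setVerbose(boolean verbose) { this.verbose = verbose; return self(); }
}

class ProcessLauncherSketch extends BaseLauncherSketch<ProcessLauncherSketch> {
    private String javaHome;

    @Override
    ProcessLauncherSketch self() { return this; }

    // Subclass-only setter, still reachable after the inherited setters.
    ProcessLauncherSketch setJavaHome(String javaHome) {
        this.javaHome = javaHome;
        return this;
    }

    // Summarizes the collected settings so the chain's effect can be inspected.
    String describe() { return appName + "|" + verbose + "|" + javaHome; }
}
```

      With this shape, `new ProcessLauncherSketch().setAppName("demo").setVerbose(true).setJavaHome("/opt/jdk")` compiles even though the first two setters live in the base class.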
    • launch

      public Process launch() throws IOException
      Launches a sub-process that will start the configured Spark application.

      The startApplication(SparkAppHandle.Listener...) method is preferred when launching Spark, since it provides better control of the child application.

      Returns:
      A process handle for the Spark app.
      Throws:
      IOException
    • startApplication

      public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws IOException
      Starts a Spark application.

      Applications launched by this launcher run as child processes. The child's stdout and stderr are merged and written to a logger (see java.util.logging) only if redirection has not otherwise been configured on this SparkLauncher. The logger's name can be defined by setting CHILD_PROCESS_LOGGER_NAME in the app's configuration. If that option is not set, the code will try to derive a name from the application's name or main class / script file. If those cannot be determined, an internal, unique name will be used. In all cases, the logger name will start with "org.apache.spark.launcher.app", to fit more easily into the configuration of commonly-used logging systems.

      Specified by:
      startApplication in class AbstractLauncher<SparkLauncher>
      Parameters:
      listeners - Listeners to add to the handle before the app is launched.
      Returns:
      A handle for the launched application.
      Throws:
      IOException
      Since:
      1.6.0
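
      Because every child logger name starts with "org.apache.spark.launcher.app", a single java.util.logging handler on that parent name can capture output from all launched applications. The sketch below shows only the logging side, using the JDK alone. `ChildLogCaptureSketch` and `CollectingHandler` are hypothetical names, and the ".myApp" suffix in the usage note is an assumed example of a derived logger name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical helper: collect records from every logger under the shared prefix.
class ChildLogCaptureSketch {
    static final String PREFIX = "org.apache.spark.launcher.app";

    // Held in a static field because the JDK keeps loggers only weakly
    // referenced; without this the handler could be lost to garbage collection.
    static final Logger PARENT = Logger.getLogger(PREFIX);

    // Stores "loggerName: message" for each record it receives.
    static class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            lines.add(record.getLoggerName() + ": " + record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Attaches a new collecting handler to the parent logger and returns it.
    static CollectingHandler attach() {
        CollectingHandler handler = new CollectingHandler();
        PARENT.addHandler(handler);
        return handler;
    }
}
```

      After `attach()`, a record logged at INFO on a logger named PREFIX + ".myApp" reaches the handler through the normal parent-handler chain.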