Package org.apache.spark.launcher

Library for launching Spark applications programmatically.

There are two ways to start applications with this library: as a child process, using SparkLauncher, or in-process, using InProcessLauncher.

The AbstractLauncher.startApplication(org.apache.spark.launcher.SparkAppHandle.Listener...) method can be used to start Spark and provide a handle to monitor and control the running application:

 
   import org.apache.spark.launcher.SparkAppHandle;
   import org.apache.spark.launcher.SparkLauncher;

   public class MyLauncher {
     public static void main(String[] args) throws Exception {
       SparkAppHandle handle = new SparkLauncher()
         .setAppResource("/my/app.jar")
         .setMainClass("my.spark.app.Main")
         .setMaster("local")
         .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
         .startApplication();
       // Use handle API to monitor / control application.
     }
   }
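The handle returned by startApplication can also report state transitions through a SparkAppHandle.Listener passed to that method. A minimal sketch of such a listener (the class name is illustrative):

```java
import org.apache.spark.launcher.SparkAppHandle;

// A minimal listener that logs application state transitions.
public class StateLogger implements SparkAppHandle.Listener {
  @Override
  public void stateChanged(SparkAppHandle handle) {
    System.out.println("State: " + handle.getState());
  }

  @Override
  public void infoChanged(SparkAppHandle handle) {
    System.out.println("App id: " + handle.getAppId());
  }
}
```

An instance would be passed as `startApplication(new StateLogger())`; the handle itself also exposes methods such as getState(), stop(), and kill() for direct control.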

Launching applications as a child process requires a full Spark installation. The installation directory can be provided explicitly through the launcher's configuration, or by setting the SPARK_HOME environment variable.
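For example, the installation directory can be set explicitly with setSparkHome instead of relying on the environment variable (the path below is hypothetical):

```java
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
      // Point the launcher at a Spark installation; overrides SPARK_HOME.
      .setSparkHome("/opt/spark")            // hypothetical installation directory
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("local")
      .launch();
    spark.waitFor();
  }
}
```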

Launching applications in-process is only recommended in cluster mode, since Spark cannot run multiple client-mode applications concurrently in the same process. The in-process launcher requires the necessary Spark dependencies (such as spark-core and cluster manager-specific modules) to be present in the caller thread's class loader.
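An in-process launch follows the same builder pattern via InProcessLauncher; a sketch, assuming a YARN cluster and the required Spark modules on the class path:

```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

public class MyInProcessLauncher {
  public static void main(String[] args) throws Exception {
    // spark-core and the cluster manager module must already be on the
    // caller's class path; no child process is spawned.
    SparkAppHandle handle = new InProcessLauncher()
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("yarn")            // in-process launching is recommended for cluster mode
      .setDeployMode("cluster")
      .startApplication();
    // Use handle API to monitor / control application.
  }
}
```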

It's also possible to launch a raw child process, without the extra monitoring, using the SparkLauncher.launch() method:

 
   import org.apache.spark.launcher.SparkLauncher;

   public class MyLauncher {
     public static void main(String[] args) throws Exception {
       Process spark = new SparkLauncher()
         .setAppResource("/my/app.jar")
         .setMainClass("my.spark.app.Main")
         .setMaster("local")
         .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
         .launch();
       spark.waitFor();
     }
   }

This method requires the calling code to manage the child process manually, including its output streams (otherwise the child may block on a full pipe buffer and deadlock). It's recommended that SparkLauncher.startApplication(org.apache.spark.launcher.SparkAppHandle.Listener...) be used instead.
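If launch() is used anyway, one way to avoid the deadlock is to redirect the child's output and error streams before launching. A sketch (the log file names are hypothetical):

```java
import java.io.File;
import java.lang.ProcessBuilder.Redirect;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("local")
      // Redirect both streams to files so the child never blocks
      // waiting for the parent to drain its pipes.
      .redirectOutput(Redirect.appendTo(new File("spark-out.log")))  // hypothetical file
      .redirectError(Redirect.appendTo(new File("spark-err.log")))   // hypothetical file
      .launch();
    int exitCode = spark.waitFor();
    System.out.println("Spark exited with code " + exitCode);
  }
}
```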