Package org.apache.spark.shuffle.api
Interface ShuffleDriverComponents
@Private
public interface ShuffleDriverComponents
:: Private ::
An interface for building shuffle support modules for the Driver.
-
Method Summary
Modifier and TypeMethodDescriptionvoid
Called once at the end of the Spark application to clean up any existing shuffle state.Called once in the driver to bootstrap this module that is specific to this application.default void
registerShuffle
(int shuffleId) Called once per shuffle id when the shuffle id is first generated for a shuffle stage.default void
removeShuffle
(int shuffleId, boolean blocking) Removes shuffle data associated with the given shuffle.default boolean
Does this shuffle component support reliable storage - external to the lifecycle of the executor host ?
-
Method Details
-
initializeApplication
Called once in the driver to bootstrap this module that is specific to this application. This method is called before submitting executor requests to the cluster manager. This method should prepare the module with its shuffle components i.e. registering against an external file servers or shuffle services, or creating tables in a shuffle storage data database.- Returns:
- additional SparkConf settings necessary for initializing the executor components. This would include configurations that cannot be statically set on the application, like the host:port of external services for shuffle storage.
-
cleanupApplication
void cleanupApplication()Called once at the end of the Spark application to clean up any existing shuffle state. -
registerShuffle
default void registerShuffle(int shuffleId) Called once per shuffle id when the shuffle id is first generated for a shuffle stage.- Parameters:
shuffleId
- The unique identifier for the shuffle stage.
-
removeShuffle
default void removeShuffle(int shuffleId, boolean blocking) Removes shuffle data associated with the given shuffle.- Parameters:
shuffleId
- The unique identifier for the shuffle stage.blocking
- Whether this call should block on the deletion of the data.
-
supportsReliableStorage
default boolean supportsReliableStorage()Does this shuffle component support reliable storage - external to the lifecycle of the executor host ? For example, writing shuffle data to a distributed filesystem or persisting it in a remote shuffle service.
-