public interface ShuffleDataIO
This is the root of a plugin system for storing shuffle bytes to arbitrary storage
backends in the sort-based shuffle algorithm implemented by the
org.apache.spark.shuffle.sort.SortShuffleManager. If another shuffle algorithm is
needed instead of sort-based shuffle, one should implement
A single instance of this module is loaded per process in the Spark application. The default implementation reads and writes shuffle data from the local disks of the executor, and is the implementation of shuffle file storage that has remained consistent throughout most of Spark's history.
Alternative implementations of shuffle data storage can be loaded via setting
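The general idea of selecting a storage implementation through a configuration setting can be sketched as follows. This is only an illustration of reflective plugin loading, not Spark's actual loading code: the interface `StoragePluginSketch`, the classes `LocalDiskPluginSketch` and `PluginLoaderSketch`, and the key `sketch.shuffle.plugin.class` are hypothetical names invented for this example, and the sketch assumes the plugin class has a no-argument constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a pluggable storage module (not a Spark type).
interface StoragePluginSketch {
    String name();
}

// Hypothetical default implementation, used when no class is configured.
final class LocalDiskPluginSketch implements StoragePluginSketch {
    @Override
    public String name() {
        return "local-disk";
    }
}

final class PluginLoaderSketch {
    // Hypothetical configuration key; this is NOT Spark's real setting name.
    static final String PLUGIN_KEY = "sketch.shuffle.plugin.class";

    // Instantiates the configured class by name, falling back to the default.
    static StoragePluginSketch load(Map<String, String> conf)
            throws ReflectiveOperationException {
        String className =
            conf.getOrDefault(PLUGIN_KEY, LocalDiskPluginSketch.class.getName());
        Class<?> cls = Class.forName(className);
        if (!StoragePluginSketch.class.isAssignableFrom(cls)) {
            throw new IllegalArgumentException(
                className + " does not implement StoragePluginSketch");
        }
        // Assumes a no-argument constructor on the plugin class.
        return (StoragePluginSketch) cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        // With an empty configuration the default implementation is used.
        System.out.println(load(conf).name());
    }
}
```

Loading by class name keeps the caller independent of any concrete storage backend; the type check before instantiation turns a misconfigured class name into an early, descriptive failure.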
Modifier and Type | Method and Description
Called once on the driver process to bootstrap the shuffle metadata modules that are maintained by the driver.
Called once on executor processes to bootstrap the shuffle data storage modules that are only invoked on the executors.
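The split between a driver-side and an executor-side bootstrap method can be sketched with simplified stand-in types. Everything below is an assumption made for illustration: `DriverSideSketch`, `ExecutorSideSketch`, `ShuffleDataIOSketch`, `InMemoryShuffleDataIO`, and the `sketch.*` keys are hypothetical, and the real Spark component interfaces differ and carry additional methods. The sketch only shows one root object handing out a driver-side module and an executor-side module, with settings produced on the driver passed along to the executor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the driver-side components.
interface DriverSideSketch {
    Map<String, String> initializeApplication();
}

// Hypothetical, simplified stand-in for the executor-side components.
interface ExecutorSideSketch {
    void initializeExecutor(String appId, String execId,
                            Map<String, String> extraConfigs);
}

// Hypothetical root interface mirroring the two bootstrap methods described above.
interface ShuffleDataIOSketch {
    ExecutorSideSketch executor();
    DriverSideSketch driver();
}

final class InMemoryShuffleDataIO implements ShuffleDataIOSketch {
    // Records what the executor side last received, for inspection.
    final Map<String, String> lastExecutorConfigs = new HashMap<>();

    @Override
    public DriverSideSketch driver() {
        // Driver side: produce settings that executors will need later.
        return () -> {
            Map<String, String> configs = new HashMap<>();
            configs.put("sketch.storage.root", "mem://shuffle");
            return configs;
        };
    }

    @Override
    public ExecutorSideSketch executor() {
        // Executor side: accept the driver-provided settings and add local ones.
        return (appId, execId, extraConfigs) -> {
            lastExecutorConfigs.clear();
            lastExecutorConfigs.putAll(extraConfigs);
            lastExecutorConfigs.put("sketch.executor.id", execId);
        };
    }
}

final class ShuffleDataIODemo {
    public static void main(String[] args) {
        InMemoryShuffleDataIO io = new InMemoryShuffleDataIO();
        Map<String, String> fromDriver = io.driver().initializeApplication();
        io.executor().initializeExecutor("app-1", "exec-7", fromDriver);
        System.out.println(io.lastExecutorConfigs.get("sketch.storage.root"));
    }
}
```

In this sketch both sides run in one JVM for simplicity; in a real application the driver and each executor are separate processes, each loading its own instance of the module.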