Package org.apache.spark
Class ShuffleDependency<K,V,C>
Object
org.apache.spark.Dependency<scala.Product2<K,V>>
org.apache.spark.ShuffleDependency<K,V,C>
- All Implemented Interfaces:
Serializable
,org.apache.spark.internal.Logging
public class ShuffleDependency<K,V,C>
extends Dependency<scala.Product2<K,V>>
implements org.apache.spark.internal.Logging
:: DeveloperApi ::
Represents a dependency on the output of a shuffle stage. Note that in the case of shuffle,
the RDD is transient since we don't need it on the executor side.
param: _rdd the parent RDD
param: partitioner partitioner used to partition the shuffle output
param: serializer Serializer
to use. If not set
explicitly then the default serializer, as specified by spark.serializer
config option, will be used.
param: keyOrdering key ordering for RDD's shuffles
param: aggregator map/reduce-side aggregator for RDD's shuffle
param: mapSideCombine whether to perform partial aggregation (also known as map-side combine)
param: shuffleWriterProcessor the processor to control the write behavior in ShuffleMapTask
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
ConstructorDescriptionShuffleDependency
(RDD<? extends scala.Product2<K, V>> _rdd, Partitioner partitioner, Serializer serializer, scala.Option<scala.math.Ordering<K>> keyOrdering, scala.Option<Aggregator<K, V, C>> aggregator, boolean mapSideCombine, org.apache.spark.shuffle.ShuffleWriteProcessor shuffleWriterProcessor, scala.reflect.ClassTag<K> evidence$1, scala.reflect.ClassTag<V> evidence$2, scala.reflect.ClassTag<C> evidence$3) -
Method Summary
Modifier and TypeMethodDescriptionscala.Option<Aggregator<K,
V, C>> scala.collection.immutable.Seq<BlockManagerId>
scala.Option<scala.math.Ordering<K>>
boolean
void
rdd()
void
setMergerLocs
(scala.collection.immutable.Seq<BlockManagerId> mergerLocs) org.apache.spark.shuffle.ShuffleHandle
int
boolean
boolean
boolean
Returns true if push-based shuffle is disabled or if the shuffle merge for this shuffle is finalized.int
shuffleMergeId is used to uniquely identify merging process of shuffle by an indeterminate stage attempt.org.apache.spark.shuffle.ShuffleWriteProcessor
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
ShuffleDependency
public ShuffleDependency(RDD<? extends scala.Product2<K, V>> _rdd, Partitioner partitioner, Serializer serializer, scala.Option<scala.math.Ordering<K>> keyOrdering, scala.Option<Aggregator<K, V, C>> aggregator, boolean mapSideCombine, org.apache.spark.shuffle.ShuffleWriteProcessor shuffleWriterProcessor, scala.reflect.ClassTag<K> evidence$1, scala.reflect.ClassTag<V> evidence$2, scala.reflect.ClassTag<C> evidence$3)
-
-
Method Details
-
partitioner
-
serializer
-
keyOrdering
-
aggregator
-
mapSideCombine
public boolean mapSideCombine() -
shuffleWriterProcessor
public org.apache.spark.shuffle.ShuffleWriteProcessor shuffleWriterProcessor() -
rdd
- Specified by:
rdd
in classDependency<scala.Product2<K,
V>>
-
shuffleId
public int shuffleId() -
shuffleHandle
public org.apache.spark.shuffle.ShuffleHandle shuffleHandle() -
shuffleMergeEnabled
public boolean shuffleMergeEnabled() -
shuffleMergeAllowed
public boolean shuffleMergeAllowed() -
shuffleMergeId
public int shuffleMergeId()shuffleMergeId is used to uniquely identify merging process of shuffle by an indeterminate stage attempt.- Returns:
- (undocumented)
-
setMergerLocs
-
getMergerLocs
-
shuffleMergeFinalized
public boolean shuffleMergeFinalized()Returns true if push-based shuffle is disabled or if the shuffle merge for this shuffle is finalized.- Returns:
- (undocumented)
-
newShuffleMergeState
public void newShuffleMergeState()
-