Package org.apache.spark.rdd
Class PartitionPruningRDD<T>

java.lang.Object
  org.apache.spark.rdd.RDD<T>
    org.apache.spark.rdd.PartitionPruningRDD<T>

All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
:: DeveloperApi ::
An RDD used to prune the partitions of its parent RDD so that tasks are not
launched on all partitions. An example use case: if we know the RDD is
range-partitioned and the execution DAG has a filter on the key, we can avoid
launching tasks on the partitions whose range does not cover the key.
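For illustration, here is a minimal sketch of that use case in Scala. The app name, key range, and partition count are arbitrary assumptions, and local[*] is used only to keep the snippet self-contained:

import org.apache.spark.{RangePartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.{PartitionPruningRDD, RDD}

val sc = new SparkContext(new SparkConf().setAppName("pruning-sketch").setMaster("local[*]"))

// A pair RDD range-partitioned by key: each partition holds a contiguous key range.
val pairs: RDD[(Int, String)] = sc.parallelize((1 to 1000).map(i => (i, s"value-$i")))
val partitioner = new RangePartitioner(8, pairs)
val ranged = pairs.partitionBy(partitioner)

// Looking up one key only needs the single partition whose range covers it,
// so prune every other partition instead of scanning all of them.
val targetKey = 42
val targetPartition = partitioner.getPartition(targetKey)
val pruned = PartitionPruningRDD.create(ranged, idx => idx == targetPartition)

// Only one task is launched; the pruned-away partitions are never computed.
val hit = pruned.filter { case (k, _) => k == targetKey }.collect()

sc.stop()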
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary

PartitionPruningRDD(RDD<T> prev, scala.Function1<Object,Object> partitionFilterFunc, scala.reflect.ClassTag<T> evidence$1)
Method Summary

Modifier and Type                   Method and Description
scala.collection.Iterator<T>        compute(Partition split, TaskContext context)
                                    :: DeveloperApi :: Implemented by subclasses to compute a given partition.
static <T> PartitionPruningRDD<T>   create(RDD<T> rdd, scala.Function1<Object,Object> partitionFilterFunc)
                                    Create a PartitionPruningRDD.

Methods inherited from class org.apache.spark.rdd.RDD
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details

PartitionPruningRDD
public PartitionPruningRDD(RDD<T> prev, scala.Function1<Object,Object> partitionFilterFunc, scala.reflect.ClassTag<T> evidence$1)
Method Details
create
public static <T> PartitionPruningRDD<T> create(RDD<T> rdd, scala.Function1<Object,Object> partitionFilterFunc)
Create a PartitionPruningRDD. This function can be used to create the PartitionPruningRDD when its type T is not known at compile time.
Parameters:
rdd - the parent RDD whose partitions are to be pruned
partitionFilterFunc - a predicate on the partition index; partitions whose index maps to true are retained
Returns:
a PartitionPruningRDD containing only the retained partitions of rdd
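A hedged note on why create exists alongside the constructor: inside code that is generic in T, no ClassTag[T] is in scope, so the constructor's implicit evidence cannot be supplied, while create obtains the ClassTag from the parent RDD itself. A minimal sketch; firstPartitionOnly is a hypothetical helper name:

import org.apache.spark.rdd.{PartitionPruningRDD, RDD}

// Hypothetical generic helper: keep only the first partition of any RDD.
// T is an unbound type parameter here, so no ClassTag[T] is available and
// the class constructor could not be invoked directly; create works because
// it reuses the parent RDD's own element ClassTag.
def firstPartitionOnly[T](rdd: RDD[T]): RDD[T] =
  PartitionPruningRDD.create(rdd, partitionIndex => partitionIndex == 0)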
compute
public scala.collection.Iterator<T> compute(Partition split, TaskContext context)
Description copied from class: RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
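For orientation, a simplified sketch of how a pruning RDD can implement compute: each retained partition wraps the parent partition it stands for, and compute delegates to the parent's iterator for that wrapped partition. The names SimplePruningRDD and PrunedPartition are hypothetical, and Spark's real implementation additionally installs a dedicated prune dependency rather than the plain one-parent constructor used here:

import org.apache.spark.{Partition, TaskContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical partition wrapper: remembers the parent partition it stands for.
class PrunedPartition(override val index: Int, val parentSplit: Partition)
  extends Partition

// Hypothetical simplified pruning RDD, modeled on Spark's private internals.
class SimplePruningRDD[T: ClassTag](prev: RDD[T], keep: Int => Boolean)
  extends RDD[T](prev) {

  // Keep only the parent partitions that pass the filter, reindexed from 0.
  override protected def getPartitions: Array[Partition] = {
    val kept = prev.partitions.filter(p => keep(p.index))
    Array.tabulate[Partition](kept.length)(i => new PrunedPartition(i, kept(i)))
  }

  // Delegate to the parent's iterator for the wrapped parent partition,
  // so pruned-away partitions are never computed.
  override def compute(split: Partition, context: TaskContext): Iterator[T] =
    prev.iterator(split.asInstanceOf[PrunedPartition].parentSplit, context)
}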