Package org.apache.spark.rdd
Class PartitionPruningRDD<T>
Object
  org.apache.spark.rdd.RDD<T>
    org.apache.spark.rdd.PartitionPruningRDD<T>
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging
:: DeveloperApi ::
An RDD used to prune RDD partitions so we can avoid launching tasks on
all partitions. An example use case: if we know the RDD is partitioned by range,
and the execution DAG has a filter on the key, we can avoid launching tasks
on partitions that don't have the range covering the key.
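For illustration, a minimal sketch of that use case, assuming a local SparkSession; the names pairs, ranged, and target are hypothetical:

    import org.apache.spark.RangePartitioner
    import org.apache.spark.rdd.{PartitionPruningRDD, RDD}
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().master("local[*]").appName("prune-example").getOrCreate()
    val sc = spark.sparkContext

    // A pair RDD partitioned by range on the key.
    val pairs = sc.parallelize(1 to 100).map(k => (k, k.toString))
    val partitioner = new RangePartitioner(4, pairs)
    val ranged = pairs.partitionBy(partitioner)

    // The partitioner knows which partition can contain key 42.
    val target = partitioner.getPartition(42)

    // Launch tasks only on that partition; the other three are never scanned.
    val pruned: RDD[(Int, String)] = PartitionPruningRDD.create(ranged, i => i == target)
    pruned.collect().foreach(println)

    spark.stop()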
Nested Class Summary
- Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging:
- org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary
Constructors:
- PartitionPruningRDD(RDD<T> prev, scala.Function1<Object, Object> partitionFilterFunc)
Method Summary
- scala.collection.Iterator<T> compute(Partition split, TaskContext context): :: DeveloperApi :: Implemented by subclasses to compute a given partition.
- static <T> PartitionPruningRDD<T> create(RDD<T> rdd, scala.Function1<Object, Object> partitionFilterFunc): Create a PartitionPruningRDD.

Methods inherited from class org.apache.spark.rdd.RDD:
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId

Methods inherited from class java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging:
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details

PartitionPruningRDD
public PartitionPruningRDD(RDD<T> prev, scala.Function1<Object, Object> partitionFilterFunc)
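Where the element type is concrete, the constructor can be used directly. A one-line sketch, reusing the hypothetical ranged and target names from the example above:

    // Direct construction works when T is concrete: the ClassTag for
    // (Int, String) is resolved at compile time.
    val prunedDirect = new PartitionPruningRDD(ranged, (i: Int) => i == target)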
Method Details
create
public static <T> PartitionPruningRDD<T> create(RDD<T> rdd, scala.Function1<Object, Object> partitionFilterFunc)
Create a PartitionPruningRDD. This function can be used to create the PartitionPruningRDD when its type T is not known at compile time.
- Parameters:
- rdd - the parent RDD whose partitions will be pruned
- partitionFilterFunc - predicate mapping a partition index to true if that partition should be kept
- Returns:
- a PartitionPruningRDD containing only the partitions of rdd accepted by partitionFilterFunc
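A sketch of the "type T is not known at compile time" case: inside generic code there is no ClassTag[T] in scope, so direct construction does not compile, while create does; the helper name keepOnlyPartition is hypothetical:

    import org.apache.spark.rdd.{PartitionPruningRDD, RDD}

    // Generic helper: T is abstract here, so `new PartitionPruningRDD(rdd, ...)`
    // would fail to compile for lack of a ClassTag[T]; the create factory
    // works because it can reuse the element class tag carried by rdd.
    def keepOnlyPartition[T](rdd: RDD[T], index: Int): RDD[T] =
      PartitionPruningRDD.create(rdd, i => i == index)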
 
compute
public scala.collection.Iterator<T> compute(Partition split, TaskContext context)
Description copied from class: RDD
:: DeveloperApi ::
Implemented by subclasses to compute a given partition.
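For orientation, a minimal sketch of how a pruning RDD can implement compute by delegating to its parent. SimplePruningRDD and WrappedPartition are hypothetical stand-ins, not Spark's internal implementation (whose partition and dependency types are private):

    import scala.reflect.ClassTag
    import org.apache.spark.{Dependency, NarrowDependency, Partition, TaskContext}
    import org.apache.spark.rdd.RDD

    // Hypothetical wrapper recording which parent split a pruned index maps to.
    class WrappedPartition(val index: Int, val parentSplit: Partition) extends Partition

    // Illustrative pruning RDD: keeps only parent partitions whose index passes keep.
    class SimplePruningRDD[T: ClassTag](prev: RDD[T], keep: Int => Boolean)
      extends RDD[T](prev.sparkContext, Nil) {

      private val parentSplits = prev.partitions.filter(p => keep(p.index))

      // Renumber the surviving parent splits as this RDD's partitions 0..n-1.
      override protected def getPartitions: Array[Partition] =
        parentSplits.zipWithIndex.map { case (parent, i) =>
          new WrappedPartition(i, parent): Partition
        }

      // A narrow dependency mapping each pruned partition to its parent split.
      override protected def getDependencies: Seq[Dependency[_]] = Seq(
        new NarrowDependency[T](prev) {
          override def getParents(partitionId: Int): Seq[Int] =
            Seq(parentSplits(partitionId).index)
        })

      // compute unwraps the pruned partition and iterates the parent's split.
      override def compute(split: Partition, context: TaskContext): Iterator[T] =
        firstParent[T].iterator(split.asInstanceOf[WrappedPartition].parentSplit, context)
    }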
 