org.apache.spark.rdd
Class PartitionPruningRDD<T>

Object
  extended by org.apache.spark.rdd.RDD<T>
      extended by org.apache.spark.rdd.PartitionPruningRDD<T>
All Implemented Interfaces:
java.io.Serializable, Logging

public class PartitionPruningRDD<T>
extends RDD<T>

:: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions. An example use case: if we know the RDD is range-partitioned by key and the execution DAG has a filter on the key, we can avoid launching tasks on partitions whose key range does not cover the key.

See Also:
Serialized Form
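
Example
A minimal usage sketch (illustrative, not part of the Spark distribution; the assumption that keys 40..49 land in partitions 4 and 5 follows from the even key spread in this toy data): a parent RDD is range-partitioned by key, and only the partitions that can contain the wanted keys are retained, so tasks run on two partitions instead of ten.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.PartitionPruningRDD

    val sc = new SparkContext(
      new SparkConf().setAppName("pruning-example").setMaster("local[2]"))

    // Keys 0..99 over 10 partitions; sortByKey installs a RangePartitioner,
    // so each partition holds a contiguous key range.
    val ranged = sc.parallelize(0 until 100)
      .map(k => (k, k.toString))
      .sortByKey(numPartitions = 10)

    // Suppose only keys in [40, 50) are needed. With an even spread those
    // keys live in partitions 4 and 5, so prune everything else.
    val pruned = PartitionPruningRDD.create(ranged, idx => idx == 4 || idx == 5)

    // Tasks are launched only on the two retained partitions.
    val hits = pruned.filter { case (k, _) => k >= 40 && k < 50 }.collect()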

Constructor Summary
PartitionPruningRDD(RDD<T> prev, scala.Function1<Object,Object> partitionFilterFunc, scala.reflect.ClassTag<T> evidence$1)
           
 
Method Summary
 scala.collection.Iterator<T> compute(Partition split, TaskContext context)
          :: DeveloperApi :: Implemented by subclasses to compute a given partition.
 static <T> PartitionPruningRDD<T> create(RDD<T> rdd, scala.Function1<Object,Object> partitionFilterFunc)
          Create a PartitionPruningRDD.
 
Methods inherited from class org.apache.spark.rdd.RDD
aggregate, cache, cartesian, checkpoint, checkpointData, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, creationSite, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, filterWith, first, flatMap, flatMapWith, fold, foreach, foreachPartition, foreachWith, getCheckpointFile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, map, mapPartitions, mapPartitionsWithContext, mapPartitionsWithIndex, mapPartitionsWithSplit, mapWith, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, scope, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toArray, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeReduce, union, unpersist, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipWithIndex, zipWithUniqueId
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

PartitionPruningRDD

public PartitionPruningRDD(RDD<T> prev,
                           scala.Function1<Object,Object> partitionFilterFunc,
                           scala.reflect.ClassTag<T> evidence$1)
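In Scala the ClassTag evidence parameter is supplied implicitly, so the constructor can be invoked directly whenever a ClassTag for T is in scope. A minimal sketch (keepEvenPartitions is a hypothetical helper, not part of Spark):

    import org.apache.spark.rdd.{PartitionPruningRDD, RDD}

    // ClassTag[String] is available implicitly, so evidence$1 need not be passed.
    def keepEvenPartitions(parent: RDD[String]): RDD[String] =
      new PartitionPruningRDD(parent, (idx: Int) => idx % 2 == 0)
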
Method Detail

create

public static <T> PartitionPruningRDD<T> create(RDD<T> rdd,
                                                scala.Function1<Object,Object> partitionFilterFunc)
Create a PartitionPruningRDD. This factory can be used to create a PartitionPruningRDD when its element type T is not known at compile time.

Parameters:
rdd - the parent RDD whose partitions are to be pruned
partitionFilterFunc - a predicate on the parent partition index (an Int => Boolean in Scala, shown here with erased types); partitions for which it returns true are retained
Returns:
a PartitionPruningRDD containing only the partitions of rdd accepted by partitionFilterFunc
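
Unlike the constructor, the factory does not require ClassTag evidence at the call site, which makes it usable from generic code where no ClassTag[T] is available. A minimal sketch (firstPartitionOnly is a hypothetical helper, not part of Spark):

    import org.apache.spark.rdd.{PartitionPruningRDD, RDD}

    // Compiles without a ClassTag[T] context bound, unlike `new PartitionPruningRDD`.
    def firstPartitionOnly[T](rdd: RDD[T]): RDD[T] =
      PartitionPruningRDD.create(rdd, partitionIndex => partitionIndex == 0)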

compute

public scala.collection.Iterator<T> compute(Partition split,
                                            TaskContext context)
Description copied from class: RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.

Specified by:
compute in class RDD<T>
Parameters:
split - the partition to compute
context - the TaskContext of the running task
Returns:
an iterator over the elements of the given partition
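
Note that compute is called by Spark's task scheduler, not by user code. Conceptually, each retained partition wraps the parent partition it was kept from, and compute delegates to the parent RDD. A simplified sketch of that delegation (PartitionPruningRDDPartition is internal to Spark; the body below is illustrative, not the verbatim source):

    override def compute(split: Partition, context: TaskContext): Iterator[T] =
      firstParent[T].iterator(
        split.asInstanceOf[PartitionPruningRDDPartition].parentSplit, context)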