Class PartitionPruningRDD<T>

Object
org.apache.spark.rdd.RDD<T>
org.apache.spark.rdd.PartitionPruningRDD<T>
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, scala.Serializable

public class PartitionPruningRDD<T> extends RDD<T>
:: DeveloperApi :: An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions. An example use case: if we know the RDD is partitioned by range, and the execution DAG has a filter on the key, we can avoid launching tasks on partitions whose range does not cover the key.
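The range-partitioning use case can be sketched without Spark as a predicate over partition indices. This is a minimal illustration, not Spark's implementation: the class RangePruningSketch, the helpers partitionFor and keepOverlapping, and the upper-bound layout of the partitions are all assumptions made for the example.

```java
import java.util.function.IntPredicate;

public class RangePruningSketch {
    // Assumed layout (hypothetical): partition i holds keys in
    // (upperBounds[i-1], upperBounds[i]], and the last partition holds
    // every key above the final bound.
    static int partitionFor(int key, int[] upperBounds) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (key <= upperBounds[i]) {
                return i;
            }
        }
        return upperBounds.length;
    }

    // Builds a predicate over partition indices that keeps only the
    // partitions whose key range can overlap the filter lo <= key <= hi.
    static IntPredicate keepOverlapping(int lo, int hi, int[] upperBounds) {
        int first = partitionFor(lo, upperBounds);
        int last = partitionFor(hi, upperBounds);
        return index -> index >= first && index <= last;
    }

    public static void main(String[] args) {
        int[] upperBounds = {10, 20, 30}; // three bounds describe four partitions
        IntPredicate keep = keepOverlapping(12, 25, upperBounds);
        for (int index = 0; index <= upperBounds.length; index++) {
            System.out.println("partition " + index + " kept: " + keep.test(index));
        }
    }
}
```

A predicate of this shape, adapted to scala.Function1<Object,Object>, is the kind of function that partitionFilterFunc expects: it receives a partition index and reports whether that partition should be kept.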
  • Constructor Details

    • PartitionPruningRDD

      public PartitionPruningRDD(RDD<T> prev, scala.Function1<Object,Object> partitionFilterFunc, scala.reflect.ClassTag<T> evidence$1)
  • Method Details

    • create

      public static <T> PartitionPruningRDD<T> create(RDD<T> rdd, scala.Function1<Object,Object> partitionFilterFunc)
      Create a PartitionPruningRDD. This function can be used to create the PartitionPruningRDD when its type T is not known at compile time.
      Parameters:
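      rdd - the parent RDD whose partitions are pruned
      partitionFilterFunc - predicate applied to each parent partition index; a partition is kept only when it returns true
      Returns:
      a PartitionPruningRDD over the parent partitions that pass the filter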
    • compute

      public scala.collection.Iterator<T> compute(Partition split, TaskContext context)
      Description copied from class: RDD
      :: DeveloperApi :: Implemented by subclasses to compute a given partition.
      Specified by:
      compute in class RDD<T>
      Parameters:
      split - the partition to compute
      context - the context of the task computing the partition
      Returns:
      an iterator over the elements of the partition