org.apache.spark.rdd

RDDBarrier

class RDDBarrier[T] extends AnyRef

Experimental

Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together. org.apache.spark.rdd.RDDBarrier instances are created by org.apache.spark.rdd.RDD#barrier.

Annotations
@Experimental() @Since( "2.4.0" )
Source
RDDBarrier.scala
Linear Supertypes
AnyRef, Any
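
Example

A minimal sketch of typical usage (the data, partition count, and local master are illustrative): the wrapped RDD's tasks are launched together, and BarrierTaskContext lets them synchronize.

import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

object RDDBarrierExample {
  def main(args: Array[String]): Unit = {
    // Barrier stages need enough slots for all tasks at once;
    // local[4] matches the four partitions below.
    val sc = new SparkContext(
      new SparkConf().setAppName("barrier-demo").setMaster("local[4]"))

    val rdd = sc.parallelize(1 to 100, numSlices = 4)

    val doubled = rdd.barrier().mapPartitions { iter =>
      val ctx = BarrierTaskContext.get()
      ctx.barrier() // wait until every task in the stage reaches this point
      iter.map(_ * 2)
    }

    println(doubled.sum())
    sc.stop()
  }
}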

Value Members

  1. def mapPartitions[S](f: (Iterator[T]) ⇒ Iterator[S], preservesPartitioning: Boolean = false)(implicit arg0: ClassTag[S]): RDD[S]

    Returns a new RDD by applying a function to each partition of the wrapped RDD, where tasks are launched together in a barrier stage. The interface is the same as org.apache.spark.rdd.RDD#mapPartitions. Please see the API doc there.

    Annotations
    @Experimental() @Since( "2.4.0" )
    See also

    org.apache.spark.BarrierTaskContext
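
    Example

    A hedged sketch of coordination inside a barrier mapPartitions: each task looks up its peers via BarrierTaskContext and synchronizes before transforming its partition. The rdd value is assumed to be an RDD[Int] created elsewhere.

    import org.apache.spark.BarrierTaskContext

    val coordinated = rdd.barrier().mapPartitions { iter =>
      val ctx = BarrierTaskContext.get()
      // Addresses of all tasks in this barrier stage, ordered by partition id.
      val peers = ctx.getTaskInfos().map(_.address)
      ctx.barrier() // rendezvous point: no task proceeds until all arrive
      iter.map(x => (ctx.partitionId(), x))
    }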

  2. def mapPartitionsWithEvaluator[U](evaluatorFactory: PartitionEvaluatorFactory[T, U])(implicit arg0: ClassTag[U]): RDD[U]

    Returns a new RDD by applying an evaluator to each partition of the wrapped RDD. The given evaluator factory is serialized and sent to executors; each task creates an evaluator from the factory and uses it to transform the data of its input partition.

    Annotations
    @DeveloperApi() @Since( "3.5.0" )
  3. def mapPartitionsWithIndex[S](f: (Int, Iterator[T]) ⇒ Iterator[S], preservesPartitioning: Boolean = false)(implicit arg0: ClassTag[S]): RDD[S]

    Returns a new RDD by applying a function to each partition of the wrapped RDD, while tracking the index of the original partition. All tasks are launched together in a barrier stage. The interface is the same as org.apache.spark.rdd.RDD#mapPartitionsWithIndex. Please see the API doc there.

    Annotations
    @Experimental() @Since( "3.0.0" )
    See also

    org.apache.spark.BarrierTaskContext
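
    Example

    A minimal sketch: tag each element with the index of its original partition while all tasks run together in a barrier stage. The rdd value is assumed to be an RDD[Int].

    import org.apache.spark.BarrierTaskContext

    val indexed = rdd.barrier().mapPartitionsWithIndex { (index, iter) =>
      BarrierTaskContext.get().barrier() // all tasks rendezvous here
      iter.map(x => s"partition $index: $x")
    }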