class BarrierTaskContext extends TaskContext with Logging

Experimental

A TaskContext with extra contextual info and tooling for tasks in a barrier stage. Use BarrierTaskContext#get to obtain the barrier context for a running barrier task.
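
As a minimal sketch of obtaining the barrier context, the example below runs a two-task barrier stage in local mode and returns the partition ID seen by each task. The object name, the `partitionIds` helper, and the `local[2]` master are assumptions made for illustration.

```scala
import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

object BarrierContextSketch {
  // Runs a two-task barrier stage and returns the partition ID seen by each task.
  def partitionIds(sc: SparkContext): Array[Int] =
    sc.parallelize(1 to 4, numSlices = 2)
      .barrier() // mark the stage as a barrier stage
      .mapPartitions { _ =>
        val context = BarrierTaskContext.get()
        context.barrier() // wait until the other task reaches this point
        Iterator.single(context.partitionId())
      }
      .collect()

  def main(args: Array[String]): Unit = {
    // Assumption: local[2] offers two slots, enough to launch both barrier tasks together.
    val conf = new SparkConf().setMaster("local[2]").setAppName("barrier-context-sketch")
    val sc = new SparkContext(conf)
    try println(partitionIds(sc).mkString(", "))
    finally sc.stop()
  }
}
```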

Annotations
@Experimental() @Since( "2.4.0" )
Source
BarrierTaskContext.scala
Linear Supertypes
Logging, TaskContext, Serializable, AnyRef, Any

Value Members

  1. def addTaskCompletionListener(listener: TaskCompletionListener): BarrierTaskContext.this.type

    Adds a (Java friendly) listener to be executed on task completion. This will be called in all situations - success, failure, or cancellation. Adding a listener to an already completed task will result in that listener being called immediately.

    Two listeners registered in the same thread will be invoked in reverse order of registration if the task completes after both are registered. There are no ordering guarantees for listeners registered in different threads, or for listeners registered after the task completes. Listeners are guaranteed to execute sequentially.

    An example use is for HadoopRDD to register a callback to close the input stream.

    Exceptions thrown by the listener will result in failure of the task.
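
    As a rough illustration of the ordering rule above, the sketch below registers two listeners from the same thread of a barrier task. The `printing` helper and the object name are hypothetical, and local mode is assumed. Per the description, the listener registered second should run first.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext, TaskContext}
    import org.apache.spark.util.TaskCompletionListener

    object CompletionListenerOrderSketch {
      // Hypothetical helper: a listener that prints a label when the task completes.
      def printing(label: String): TaskCompletionListener = new TaskCompletionListener {
        override def onTaskCompletion(context: TaskContext): Unit =
          println(s"partition ${context.partitionId()}: $label")
      }

      // Registers two listeners in the task thread, then passes the data through.
      def run(sc: SparkContext): Array[Int] =
        sc.parallelize(1 to 2, numSlices = 1)
          .barrier()
          .mapPartitions { iter =>
            val context = BarrierTaskContext.get()
            context.addTaskCompletionListener(printing("registered first"))
            context.addTaskCompletionListener(printing("registered second"))
            iter
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[1]").setAppName("listener-order-sketch")
        val sc = new SparkContext(conf)
        try run(sc)
        finally sc.stop()
      }
    }
    ```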

    Definition Classes
    BarrierTaskContext → TaskContext
  2. def addTaskCompletionListener[U](f: (TaskContext) ⇒ U): TaskContext

    Adds a listener in the form of a Scala closure to be executed on task completion. This will be called in all situations - success, failure, or cancellation. Adding a listener to an already completed task will result in that listener being called immediately.

    An example use is for HadoopRDD to register a callback to close the input stream.

    Exceptions thrown by the listener will result in failure of the task.
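
    A minimal sketch of the closure form, assuming local mode: each task wraps its partition in a reader (a stand-in for a real input stream) and registers a closure that closes it on completion. The object and helper names are hypothetical. The explicit `[Unit]` type argument selects this overload rather than the listener-based one.

    ```scala
    import java.io.{BufferedReader, StringReader}
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object CompletionClosureSketch {
      // Counts the lines in each partition through a reader that is closed on task completion.
      def lineCounts(sc: SparkContext): Array[Int] =
        sc.parallelize(Seq("a\nb", "c"), numSlices = 2)
          .barrier()
          .mapPartitions { iter =>
            val context = BarrierTaskContext.get()
            // Stand-in for a real input stream opened by the task.
            val reader = new BufferedReader(new StringReader(iter.mkString("\n")))
            // Runs on success, failure, or cancellation of the task.
            context.addTaskCompletionListener[Unit](_ => reader.close())
            val count = Iterator.continually(reader.readLine()).takeWhile(_ != null).size
            Iterator.single(count)
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("completion-closure-sketch")
        val sc = new SparkContext(conf)
        try println(lineCounts(sc).mkString(", "))
        finally sc.stop()
      }
    }
    ```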

    Definition Classes
    TaskContext
  3. def addTaskFailureListener(listener: TaskFailureListener): BarrierTaskContext.this.type

    Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail). Adding a listener to an already failed task will result in that listener being called immediately.

    Note: Prior to Spark 3.4.0, failure listeners were only invoked if the main task body failed.

    Definition Classes
    BarrierTaskContext → TaskContext
  4. def addTaskFailureListener(f: (TaskContext, Throwable) ⇒ Unit): TaskContext

    Adds a listener to be executed on task failure (which includes completion listener failure, if the task body did not already fail). Adding a listener to an already failed task will result in that listener being called immediately.

    Note: Prior to Spark 3.4.0, failure listeners were only invoked if the main task body failed.
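
    The sketch below shows one way to register a failure closure, assuming local mode. The task body throws a simulated error, so the listener runs and the job surfaces a SparkException on the driver. The names `logFailure` and `runFailingStage` are hypothetical. The closure is given an explicit function type so that the function-based overload is chosen without ambiguity.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext, SparkException, TaskContext}

    object FailureListenerSketch {
      // Typed as a Function2 so the closure overload is selected unambiguously.
      val logFailure: (TaskContext, Throwable) => Unit = (context, error) =>
        println(s"partition ${context.partitionId()} failed: ${error.getMessage}")

      // Returns true if the job failed on the driver, as expected for this simulated error.
      def runFailingStage(sc: SparkContext): Boolean =
        try {
          sc.parallelize(1 to 2, numSlices = 1)
            .barrier()
            .mapPartitions { iter =>
              BarrierTaskContext.get().addTaskFailureListener(logFailure)
              if (iter.hasNext) throw new IllegalStateException("simulated failure")
              iter
            }
            .collect()
          false
        } catch {
          case _: SparkException => true
        }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[1]").setAppName("failure-listener-sketch")
        val sc = new SparkContext(conf)
        try println(s"job failed: ${runFailingStage(sc)}")
        finally sc.stop()
      }
    }
    ```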

    Definition Classes
    TaskContext
  5. def allGather(message: String): Array[String]

    Blocks until all tasks in the same stage have reached this routine. Each task passes in a message and receives an array of the messages passed in by all of those tasks.

    CAUTION! The allGather method requires the same precautions as the barrier method.

    The message is of type String rather than Array[Byte] because a String is more convenient for the user, at the cost of worse performance.
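
    A small sketch of allGather, assuming local mode with two tasks: every task contributes one message and gets back the messages from all tasks. The object name and the message text are hypothetical.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object AllGatherSketch {
      // Each task sends one message and returns everything it received.
      def gather(sc: SparkContext): Array[Seq[String]] =
        sc.parallelize(1 to 4, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            // Every task must make this call, exactly as with barrier().
            val messages = context.allGather(s"hello from partition ${context.partitionId()}")
            Iterator.single(messages.toSeq)
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("all-gather-sketch")
        val sc = new SparkContext(conf)
        try gather(sc).foreach(received => println(received.mkString(" | ")))
        finally sc.stop()
      }
    }
    ```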

    Annotations
    @Experimental() @Since( "3.0.0" )
  6. def attemptNumber(): Int

    How many times this task has been attempted. The first task attempt will be assigned attemptNumber = 0, and subsequent attempts will have increasing attempt numbers.

    Definition Classes
    BarrierTaskContext → TaskContext
  7. def barrier(): Unit

    Sets a global barrier and waits until all tasks in this stage hit this barrier. Similar to the MPI_Barrier function in MPI, the barrier() call blocks until all tasks in the same stage have reached this routine.

    CAUTION! In a barrier stage, each task must make the same number of barrier() calls in every possible code branch. Otherwise, the job may hang or fail with a SparkException after a timeout. Some examples of misuse are listed below:

    1. Calling barrier() on only a subset of the tasks in the same barrier stage, which leads to a timeout of the call.

    rdd.barrier().mapPartitions { iter =>
        val context = BarrierTaskContext.get()
        if (context.partitionId() == 0) {
            // Do nothing.
        } else {
            context.barrier()
        }
        iter
    }

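    2. Calling barrier() inside a try-catch block, which may lead to a timeout of the second barrier() call. If doSomething() throws in only some tasks, those tasks skip the first barrier() and make fewer calls than the others.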

    rdd.barrier().mapPartitions { iter =>
        val context = BarrierTaskContext.get()
        try {
            // Do something that might throw an Exception.
            doSomething()
            context.barrier()
        } catch {
            case e: Exception => logWarning("...", e)
        }
        context.barrier()
        iter
    }
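
    For contrast with the misuses above, here is a sketch of one way to keep the barrier() call count equal across tasks, assuming local mode: the failure is handled before the barrier, so every task reaches barrier() exactly once. `riskyStep` is a hypothetical stand-in for doSomething() that fails only on odd-numbered partitions.

    ```scala
    import scala.util.Try
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object BalancedBarrierSketch {
      // Hypothetical step that fails on odd-numbered partitions only.
      def riskyStep(partitionId: Int): Unit =
        if (partitionId % 2 == 1) throw new IllegalStateException(s"partition $partitionId failed")

      // Returns, per partition, whether the risky step succeeded.
      def run(sc: SparkContext): Array[Boolean] =
        sc.parallelize(1 to 4, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            // Handle the failure before the barrier so an exception cannot skip the call.
            val succeeded = Try(riskyStep(context.partitionId())).isSuccess
            context.barrier() // reached exactly once on every code path
            Iterator.single(succeeded)
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("balanced-barrier-sketch")
        val sc = new SparkContext(conf)
        try println(run(sc).mkString(", "))
        finally sc.stop()
      }
    }
    ```
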
    Annotations
    @Experimental() @Since( "2.4.0" )
  8. def cpus(): Int

    CPUs allocated to the task.

    Definition Classes
    BarrierTaskContext → TaskContext
  9. def getLocalProperty(key: String): String

    Get a local property set upstream in the driver, or null if it is missing. See also org.apache.spark.SparkContext.setLocalProperty.
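
    A brief sketch, assuming local mode: the driver sets a property with SparkContext.setLocalProperty and each task reads it back, wrapping the result in Option to guard against the null returned for a missing key. The keys `sketch.owner` and `sketch.missing` and the value `team-a` are hypothetical.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object LocalPropertySketch {
      // Returns "<owner>/<missing>" as seen by each task.
      def readProperties(sc: SparkContext): Array[String] = {
        sc.setLocalProperty("sketch.owner", "team-a") // set on the driver thread that submits the job
        sc.parallelize(1 to 2, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            // Option(...) turns the null for a missing key into None.
            val owner = Option(context.getLocalProperty("sketch.owner")).getOrElse("unset")
            val missing = Option(context.getLocalProperty("sketch.missing")).getOrElse("unset")
            Iterator.single(s"$owner/$missing")
          }
          .collect()
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("local-property-sketch")
        val sc = new SparkContext(conf)
        try println(readProperties(sc).mkString(", "))
        finally sc.stop()
      }
    }
    ```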

    Definition Classes
    BarrierTaskContext → TaskContext
  10. def getMetricsSources(sourceName: String): Seq[Source]

    ::DeveloperApi:: Returns all metrics sources with the given name which are associated with the instance which runs the task. For more information see org.apache.spark.metrics.MetricsSystem.

    Definition Classes
    BarrierTaskContext → TaskContext
  11. def getTaskInfos(): Array[BarrierTaskInfo]

    Returns BarrierTaskInfo for all tasks in this barrier stage, ordered by partition ID.
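
    As a small sketch, assuming local mode, each task below lists the address of every task in its barrier stage via the `address` field of BarrierTaskInfo. The object and helper names are hypothetical, and the address values depend on the deployment.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object TaskInfosSketch {
      // Returns, for each task, the addresses of all tasks in the stage.
      def addresses(sc: SparkContext): Array[Seq[String]] =
        sc.parallelize(1 to 4, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            Iterator.single(context.getTaskInfos().map(_.address).toSeq)
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("task-infos-sketch")
        val sc = new SparkContext(conf)
        try addresses(sc).foreach(peers => println(peers.mkString(", ")))
        finally sc.stop()
      }
    }
    ```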

    Annotations
    @Experimental() @Since( "2.4.0" )
  12. def isCompleted(): Boolean

    Returns true if the task has completed.

    Definition Classes
    BarrierTaskContext → TaskContext
  13. def isFailed(): Boolean

    Returns true if the task has failed.

    Definition Classes
    BarrierTaskContext → TaskContext
  14. def isInterrupted(): Boolean

    Returns true if the task has been killed.

    Definition Classes
    BarrierTaskContext → TaskContext
  15. def numPartitions(): Int

    Total number of partitions in the stage that this task belongs to.

    Definition Classes
    BarrierTaskContext → TaskContext
  16. def partitionId(): Int

    The ID of the RDD partition that is computed by this task.

    Definition Classes
    BarrierTaskContext → TaskContext
  17. def resources(): Map[String, ResourceInformation]

    Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to org.apache.spark.resource.ResourceInformation for specifics.
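
    A hedged sketch of reading the resource map, assuming local mode with no resources configured: the lookup for a resource named "gpu" (a hypothetical name that depends on the cluster configuration) then yields no addresses. On a cluster with GPU scheduling configured, the same lookup would return the addresses assigned to the task.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object TaskResourcesSketch {
      // Returns the addresses of the "gpu" resource assigned to each task, or an empty list.
      def gpuAddresses(sc: SparkContext): Array[Seq[String]] =
        sc.parallelize(1 to 2, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            // The map key is the resource name; the value carries the assigned addresses.
            val addresses = context.resources().get("gpu").map(_.addresses.toSeq).getOrElse(Seq.empty)
            Iterator.single(addresses)
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("task-resources-sketch")
        val sc = new SparkContext(conf)
        try gpuAddresses(sc).foreach(a => println(s"gpu addresses: ${a.mkString(", ")}"))
        finally sc.stop()
      }
    }
    ```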

    Definition Classes
    BarrierTaskContext → TaskContext
  18. def resourcesJMap(): Map[String, ResourceInformation]

    (Java-specific) Resources allocated to the task. The key is the resource name and the value is information about the resource. Please refer to org.apache.spark.resource.ResourceInformation for specifics.

    Definition Classes
    BarrierTaskContext → TaskContext
  19. def stageAttemptNumber(): Int

    How many times the stage that this task belongs to has been attempted. The first stage attempt will be assigned stageAttemptNumber = 0, and subsequent attempts will have increasing attempt numbers.

    Definition Classes
    BarrierTaskContext → TaskContext
  20. def stageId(): Int

    The ID of the stage that this task belongs to.

    Definition Classes
    BarrierTaskContext → TaskContext
  21. def taskAttemptId(): Long

    An ID that is unique to this task attempt (within the same SparkContext, no two task attempts will share the same attempt ID). This is roughly equivalent to Hadoop's TaskAttemptID.
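
    To show how the identifier accessors (stageId, stageAttemptNumber, partitionId, attemptNumber, taskAttemptId) relate, the sketch below collects them from each task of a barrier stage. Local mode is assumed, and `TaskIdentity` is a hypothetical case class introduced only for this illustration.

    ```scala
    import org.apache.spark.{BarrierTaskContext, SparkConf, SparkContext}

    object TaskIdentitySketch {
      // Hypothetical container for the identifiers exposed by the context.
      final case class TaskIdentity(
          stageId: Int,
          stageAttempt: Int,
          partitionId: Int,
          attempt: Int,
          taskAttemptId: Long)

      // Collects the identifiers seen by each task of a two-task barrier stage.
      def identities(sc: SparkContext): Array[TaskIdentity] =
        sc.parallelize(1 to 4, numSlices = 2)
          .barrier()
          .mapPartitions { _ =>
            val context = BarrierTaskContext.get()
            Iterator.single(TaskIdentity(
              context.stageId(),
              context.stageAttemptNumber(),
              context.partitionId(),
              context.attemptNumber(),
              context.taskAttemptId()))
          }
          .collect()

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("task-identity-sketch")
        val sc = new SparkContext(conf)
        try identities(sc).foreach(println)
        finally sc.stop()
      }
    }
    ```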

    Definition Classes
    BarrierTaskContext → TaskContext
  22. def taskMetrics(): TaskMetrics
    Definition Classes
    BarrierTaskContext → TaskContext