Class StateSpec<KeyType,ValueType,StateType,MappedType>

Object
org.apache.spark.streaming.StateSpec<KeyType,ValueType,StateType,MappedType>
Type Parameters:
KeyType - Class of the state key
ValueType - Class of the state value
StateType - Class of the state data
MappedType - Class of the mapped elements
All Implemented Interfaces:
Serializable

public abstract class StateSpec<KeyType,ValueType,StateType,MappedType> extends Object implements Serializable
:: Experimental :: Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). Use org.apache.spark.streaming.StateSpec.function() factory methods to create instances of this class.

Example in Scala:


    // A mapping function that maintains an integer state and return a String
    def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
      // Use state.exists(), state.get(), state.update() and state.remove()
      // to manage state, and return the necessary string
    }

    val spec = StateSpec.function(mappingFunction).numPartitions(10)

    val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)
 

Example in Java:


   // A mapping function that maintains an integer state and return a string
   Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
       new Function3<String, Optional<Integer>, State<Integer>, String>() {
           @Override
           public Optional<String> call(Optional<Integer> value, State<Integer> state) {
               // Use state.exists(), state.get(), state.update() and state.remove()
               // to manage state, and return the necessary string
           }
       };

    JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream =
        keyValueDStream.mapWithState(StateSpec.function(mappingFunc));
 

See Also:
  • Constructor Details

    • StateSpec

      public StateSpec()
  • Method Details

    • function

      public static <KeyType, ValueType, StateType, MappedType> StateSpec<KeyType,ValueType,StateType,MappedType> function(scala.Function4<Time,KeyType,scala.Option<ValueType>,State<StateType>,scala.Option<MappedType>> mappingFunction)
      Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.

      Parameters:
      mappingFunction - The function applied on every data item to manage the associated state and generate the mapped data
      Returns:
      (undocumented)
    • function

      public static <KeyType, ValueType, StateType, MappedType> StateSpec<KeyType,ValueType,StateType,MappedType> function(scala.Function3<KeyType,scala.Option<ValueType>,State<StateType>,MappedType> mappingFunction)
      Create a StateSpec for setting all the specifications of the mapWithState operation on a pair DStream.

      Parameters:
      mappingFunction - The function applied on every data item to manage the associated state and generate the mapped data
      Returns:
      (undocumented)
    • function

      public static <KeyType, ValueType, StateType, MappedType> StateSpec<KeyType,ValueType,StateType,MappedType> function(Function4<Time,KeyType,Optional<ValueType>,State<StateType>,Optional<MappedType>> mappingFunction)
      Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.

      Parameters:
      mappingFunction - The function applied on every data item to manage the associated state and generate the mapped data
      Returns:
      (undocumented)
    • function

      public static <KeyType, ValueType, StateType, MappedType> StateSpec<KeyType,ValueType,StateType,MappedType> function(Function3<KeyType,Optional<ValueType>,State<StateType>,MappedType> mappingFunction)
      Create a StateSpec for setting all the specifications of the mapWithState operation on a JavaPairDStream.

      Parameters:
      mappingFunction - The function applied on every data item to manage the associated state and generate the mapped data
      Returns:
      (undocumented)
    • initialState

      public abstract StateSpec<KeyType,ValueType,StateType,MappedType> initialState(RDD<scala.Tuple2<KeyType,StateType>> rdd)
      Set the RDD containing the initial states that will be used by mapWithState
      Parameters:
      rdd - (undocumented)
      Returns:
      (undocumented)
    • initialState

      public abstract StateSpec<KeyType,ValueType,StateType,MappedType> initialState(JavaPairRDD<KeyType,StateType> javaPairRDD)
      Set the RDD containing the initial states that will be used by mapWithState
      Parameters:
      javaPairRDD - (undocumented)
      Returns:
      (undocumented)
    • numPartitions

      public abstract StateSpec<KeyType,ValueType,StateType,MappedType> numPartitions(int numPartitions)
      Set the number of partitions by which the state RDDs generated by mapWithState will be partitioned. Hash partitioning will be used.
      Parameters:
      numPartitions - (undocumented)
      Returns:
      (undocumented)
    • partitioner

      public abstract StateSpec<KeyType,ValueType,StateType,MappedType> partitioner(Partitioner partitioner)
      Set the partitioner by which the state RDDs generated by mapWithState will be partitioned.
      Parameters:
      partitioner - (undocumented)
      Returns:
      (undocumented)
    • timeout

      public abstract StateSpec<KeyType,ValueType,StateType,MappedType> timeout(Duration idleDuration)
      Set the duration after which the state of an idle key will be removed. A key and its state is considered idle if it has not received any data for at least the given duration. The mapping function will be called one final time on the idle states that are going to be removed; State.isTimingOut() set to true in that call.
      Parameters:
      idleDuration - (undocumented)
      Returns:
      (undocumented)