Packages

p

org.apache.spark

streaming

package streaming

Spark Streaming functionality. org.apache.spark.streaming.StreamingContext serves as the main entry point to Spark Streaming, while org.apache.spark.streaming.dstream.DStream is the data type representing a continuous sequence of RDDs, representing a continuous stream of data.

In addition, org.apache.spark.streaming.dstream.PairDStreamFunctions contains operations available only on DStreams of key-value pairs, such as groupByKey and reduceByKey. These operations are automatically available on any DStream of the right type (e.g. DStream[(Int, Int)] through implicit conversions.

For the Java API of Spark Streaming, take a look at the org.apache.spark.streaming.api.java.JavaStreamingContext which serves as the entry point, and the org.apache.spark.streaming.api.java.JavaDStream and the org.apache.spark.streaming.api.java.JavaPairDStream which have the DStream functionality.

Source
package.scala
Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. streaming
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Type Members

  1. case class Duration(millis: Long) extends Product with Serializable
  2. sealed abstract class State[S] extends AnyRef

    :: Experimental :: Abstract class for getting and updating the state in mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).

    :: Experimental :: Abstract class for getting and updating the state in mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).

    Scala example of using State:

    // A mapping function that maintains an integer state and returns a String
    def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
      // Check if state exists
      if (state.exists) {
        val existingState = state.get  // Get the existing state
        val shouldRemove = ...         // Decide whether to remove the state
        if (shouldRemove) {
          state.remove()     // Remove the state
        } else {
          val newState = ...
          state.update(newState)    // Set the new state
        }
      } else {
        val initialState = ...
        state.update(initialState)  // Set the initial state
      }
      ... // return something
    }

    Java example of using State:

    // A mapping function that maintains an integer state and returns a String
    Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
       new Function3<String, Optional<Integer>, State<Integer>, String>() {
    
         @Override
         public String call(String key, Optional<Integer> value, State<Integer> state) {
           if (state.exists()) {
             int existingState = state.get(); // Get the existing state
             boolean shouldRemove = ...; // Decide whether to remove the state
             if (shouldRemove) {
               state.remove(); // Remove the state
             } else {
               int newState = ...;
               state.update(newState); // Set the new state
             }
           } else {
             int initialState = ...; // Set the initial state
             state.update(initialState);
           }
           // return something
         }
       };
    S

    Class of the state

    Annotations
    @Experimental()
  3. sealed abstract class StateSpec[KeyType, ValueType, StateType, MappedType] extends Serializable

    :: Experimental :: Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).

    :: Experimental :: Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). Use org.apache.spark.streaming.StateSpec.function() factory methods to create instances of this class.

    Example in Scala:

    // A mapping function that maintains an integer state and return a String
    def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
      // Use state.exists(), state.get(), state.update() and state.remove()
      // to manage state, and return the necessary string
    }
    
    val spec = StateSpec.function(mappingFunction).numPartitions(10)
    
    val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)

    Example in Java:

    // A mapping function that maintains an integer state and return a string
    Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
        new Function3<String, Optional<Integer>, State<Integer>, String>() {
            @Override
            public Optional<String> call(Optional<Integer> value, State<Integer> state) {
                // Use state.exists(), state.get(), state.update() and state.remove()
                // to manage state, and return the necessary string
            }
        };
    
     JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream =
         keyValueDStream.mapWithState(StateSpec.function(mappingFunc));
    KeyType

    Class of the state key

    ValueType

    Class of the state value

    StateType

    Class of the state data

    MappedType

    Class of the mapped elements

    Annotations
    @Experimental()
  4. sealed abstract final class StreamingContextState extends Enum[StreamingContextState]
    Annotations
    @DeveloperApi()
  5. case class Time(millis: Long) extends Product with Serializable

    This is a simple class that represents an absolute instant of time.

    This is a simple class that represents an absolute instant of time. Internally, it represents time as the difference, measured in milliseconds, between the current time and midnight, January 1, 1970 UTC. This is the same format as what is returned by System.currentTimeMillis.

  6. class StreamingContext extends Logging

    Main entry point for Spark Streaming functionality.

    Main entry point for Spark Streaming functionality. It provides methods used to create org.apache.spark.streaming.dstream.DStreams from various input sources. It can be either created by providing a Spark master URL and an appName, or from a org.apache.spark.SparkConf configuration (see core Spark documentation), or from an existing org.apache.spark.SparkContext. The associated SparkContext can be accessed using context.sparkContext. After creating and transforming DStreams, the streaming computation can be started and stopped using context.start() and context.stop(), respectively. context.awaitTermination() allows the current thread to wait for the termination of the context by stop() or by an exception.

    Annotations
    @deprecated
    Deprecated

    (Since version Spark 3.4.0) DStream is deprecated. Migrate to Structured Streaming.

Value Members

  1. object Durations
  2. object Milliseconds

    Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of milliseconds.

  3. object Minutes

    Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of minutes.

  4. object Seconds

    Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of seconds.

  5. object StateSpec extends Serializable

    :: Experimental :: Builder object for creating instances of org.apache.spark.streaming.StateSpec that is used for specifying the parameters of the DStream transformation mapWithState that is used for specifying the parameters of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).

    :: Experimental :: Builder object for creating instances of org.apache.spark.streaming.StateSpec that is used for specifying the parameters of the DStream transformation mapWithState that is used for specifying the parameters of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).

    Example in Scala:

    // A mapping function that maintains an integer state and return a String
    def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
      // Use state.exists(), state.get(), state.update() and state.remove()
      // to manage state, and return the necessary string
    }
    
    val spec = StateSpec.function(mappingFunction).numPartitions(10)
    
    val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)

    Example in Java:

    // A mapping function that maintains an integer state and return a string
    Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
        new Function3<String, Optional<Integer>, State<Integer>, String>() {
            @Override
            public Optional<String> call(Optional<Integer> value, State<Integer> state) {
                // Use state.exists(), state.get(), state.update() and state.remove()
                // to manage state, and return the necessary string
            }
        };
    
     JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream =
         keyValueDStream.mapWithState(StateSpec.function(mappingFunc));
    Annotations
    @Experimental()
  6. object StreamingConf
  7. object Time extends Serializable

Deprecated Value Members

  1. object StreamingContext extends Logging

    StreamingContext object contains a number of utility functions related to the StreamingContext class.

    StreamingContext object contains a number of utility functions related to the StreamingContext class.

    Annotations
    @deprecated
    Deprecated

    (Since version Spark 3.4.0) DStream is deprecated. Migrate to Structured Streaming.

Inherited from AnyRef

Inherited from Any

Ungrouped