public class JavaStreamingContext
extends Object
implements java.io.Closeable

A Java-friendly version of StreamingContext, which is the main entry point for Spark Streaming functionality. It provides methods to create JavaDStream and JavaPairDStream from input sources. The internal org.apache.spark.api.java.JavaSparkContext (see core Spark documentation) can be accessed using context.sparkContext. After creating and transforming DStreams, the streaming computation can be started and stopped using context.start() and context.stop(), respectively. context.awaitTermination() allows the current thread to wait for the termination of the context by stop() or by an exception.

| Constructor and Description |
|---|
| JavaStreamingContext(JavaSparkContext sparkContext, Duration batchDuration): Create a JavaStreamingContext using an existing JavaSparkContext. |
| JavaStreamingContext(SparkConf conf, Duration batchDuration): Create a JavaStreamingContext using a SparkConf configuration. |
| JavaStreamingContext(StreamingContext ssc) |
| JavaStreamingContext(String path): Recreate a JavaStreamingContext from a checkpoint file. |
| JavaStreamingContext(String path, org.apache.hadoop.conf.Configuration hadoopConf): Recreate a JavaStreamingContext from a checkpoint file. |
| JavaStreamingContext(String master, String appName, Duration batchDuration): Create a StreamingContext. |
| JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String jarFile): Create a StreamingContext. |
| JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String[] jars): Create a StreamingContext. |
| JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String[] jars, java.util.Map<String,String> environment): Create a StreamingContext. |
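A minimal sketch of the two most common construction paths, assuming Spark 1.x (the API version this page documents) is on the classpath; the master URL and app name are hypothetical:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ContextCreation {
  public static void main(String[] args) {
    // Hypothetical local configuration; use at least 2 threads so a receiver can run.
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("ContextCreation");

    // Variant A: from a SparkConf; the streaming context creates its own SparkContext.
    JavaStreamingContext fromConf = new JavaStreamingContext(conf, new Duration(1000));
    fromConf.stop(); // also stops the underlying SparkContext

    // Variant B: from an existing JavaSparkContext, e.g. to share it with batch jobs.
    JavaSparkContext sc = new JavaSparkContext(conf);
    JavaStreamingContext fromSc = new JavaStreamingContext(sc, new Duration(1000));
    fromSc.stop();
  }
}
```

The Duration argument is the batch interval: streaming data is divided into batches of this length (1000 ms here).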
| Modifier and Type | Method and Description |
|---|---|
| <T> JavaReceiverInputDStream<T> | actorStream(akka.actor.Props props, String name): Create an input stream with any arbitrary user-implemented actor receiver. |
| <T> JavaReceiverInputDStream<T> | actorStream(akka.actor.Props props, String name, StorageLevel storageLevel): Create an input stream with any arbitrary user-implemented actor receiver. |
| <T> JavaReceiverInputDStream<T> | actorStream(akka.actor.Props props, String name, StorageLevel storageLevel, akka.actor.SupervisorStrategy supervisorStrategy): Create an input stream with any arbitrary user-implemented actor receiver. |
| void | addStreamingListener(StreamingListener streamingListener): Add a StreamingListener object for receiving system events related to streaming. |
| void | awaitTermination(): Wait for the execution to stop. |
| void | awaitTermination(long timeout): Wait for the execution to stop. |
| void | checkpoint(String directory): Sets the context to periodically checkpoint the DStream operations for master fault-tolerance. |
| void | close() |
| <K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> JavaPairInputDStream<K,V> | fileStream(String directory): Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format. |
| static JavaStreamingContext | getOrCreate(String checkpointPath, org.apache.hadoop.conf.Configuration hadoopConf, JavaStreamingContextFactory factory): Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. |
| static JavaStreamingContext | getOrCreate(String checkpointPath, org.apache.hadoop.conf.Configuration hadoopConf, JavaStreamingContextFactory factory, boolean createOnError): Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. |
| static JavaStreamingContext | getOrCreate(String checkpointPath, JavaStreamingContextFactory factory): Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. |
| static String[] | jarOfClass(Class<?> cls): Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext. |
| <T> JavaDStream<T> | queueStream(java.util.Queue<JavaRDD<T>> queue): Create an input stream from a queue of RDDs. |
| <T> JavaInputDStream<T> | queueStream(java.util.Queue<JavaRDD<T>> queue, boolean oneAtATime): Create an input stream from a queue of RDDs. |
| <T> JavaInputDStream<T> | queueStream(java.util.Queue<JavaRDD<T>> queue, boolean oneAtATime, JavaRDD<T> defaultRDD): Create an input stream from a queue of RDDs. |
| <T> JavaReceiverInputDStream<T> | rawSocketStream(String hostname, int port): Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them. |
| <T> JavaReceiverInputDStream<T> | rawSocketStream(String hostname, int port, StorageLevel storageLevel): Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them. |
| <T> JavaReceiverInputDStream<T> | receiverStream(Receiver<T> receiver): Create an input stream with any arbitrary user-implemented receiver. |
| void | remember(Duration duration): Sets each DStream in this context to remember the RDDs it generated in the last given duration. |
| JavaSparkContext | sc() |
| <T> JavaReceiverInputDStream<T> | socketStream(String hostname, int port, Function<java.io.InputStream,Iterable<T>> converter, StorageLevel storageLevel): Create an input stream from network source hostname:port. |
| JavaReceiverInputDStream<String> | socketTextStream(String hostname, int port): Create an input stream from network source hostname:port. |
| JavaReceiverInputDStream<String> | socketTextStream(String hostname, int port, StorageLevel storageLevel): Create an input stream from network source hostname:port. |
| JavaSparkContext | sparkContext(): The underlying SparkContext |
| StreamingContext | ssc() |
| void | start(): Start the execution of the streams. |
| void | stop(): Stop the execution of the streams. |
| void | stop(boolean stopSparkContext): Stop the execution of the streams. |
| void | stop(boolean stopSparkContext, boolean stopGracefully): Stop the execution of the streams. |
| JavaDStream<String> | textFileStream(String directory): Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files (using key as LongWritable, value as Text and input format as TextInputFormat). |
| <T> JavaDStream<T> | transform(java.util.List<JavaDStream<?>> dstreams, Function2<java.util.List<JavaRDD<?>>,Time,JavaRDD<T>> transformFunc): Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams. |
| <K,V> JavaPairDStream<K,V> | transformToPair(java.util.List<JavaDStream<?>> dstreams, Function2<java.util.List<JavaRDD<?>>,Time,JavaPairRDD<K,V>> transformFunc): Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams. |
| <T> JavaDStream<T> | union(JavaDStream<T> first, java.util.List<JavaDStream<T>> rest): Create a unified DStream from multiple DStreams of the same type and same slide duration. |
| <K,V> JavaPairDStream<K,V> | union(JavaPairDStream<K,V> first, java.util.List<JavaPairDStream<K,V>> rest): Create a unified DStream from multiple DStreams of the same type and same slide duration. |
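The methods above compose into the usual receive-transform-start-await pattern. A sketch of the canonical word count over a socket source, assuming the Spark 1.x Java API with Java 8 lambdas and a hypothetical server on localhost:9999:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

public class NetworkWordCount {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));

    // Text lines from a hypothetical server listening on localhost:9999.
    JavaReceiverInputDStream<String> lines = jssc.socketTextStream("localhost", 9999);

    // Split each line into words, then count words per 1-second batch.
    JavaDStream<String> words = lines.flatMap(line -> Arrays.asList(line.split(" ")));
    JavaPairDStream<String, Integer> counts =
        words.mapToPair(w -> new Tuple2<>(w, 1)).reduceByKey((a, b) -> a + b);
    counts.print(); // print the first few counts of each batch

    jssc.start();            // start receiving and processing
    jssc.awaitTermination(); // block until stop() or an exception
  }
}
```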
public JavaStreamingContext(StreamingContext ssc)

public JavaStreamingContext(String master, String appName, Duration batchDuration)
Create a StreamingContext.
Parameters:
master - Name of the Spark Master
appName - Name to be used when registering with the scheduler
batchDuration - The time interval at which streaming data will be divided into batches

public JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String jarFile)
Create a StreamingContext.
Parameters:
master - Name of the Spark Master
appName - Name to be used when registering with the scheduler
batchDuration - The time interval at which streaming data will be divided into batches
sparkHome - The SPARK_HOME directory on the slave nodes
jarFile - JAR file containing job code, to ship to the cluster. This can be a path on the local file system or an HDFS, HTTP, HTTPS, or FTP URL.

public JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String[] jars)
Create a StreamingContext.
Parameters:
master - Name of the Spark Master
appName - Name to be used when registering with the scheduler
batchDuration - The time interval at which streaming data will be divided into batches
sparkHome - The SPARK_HOME directory on the slave nodes
jars - Collection of JARs to send to the cluster. These can be paths on the local file system or HDFS, HTTP, HTTPS, or FTP URLs.

public JavaStreamingContext(String master, String appName, Duration batchDuration, String sparkHome, String[] jars, java.util.Map<String,String> environment)
Create a StreamingContext.
Parameters:
master - Name of the Spark Master
appName - Name to be used when registering with the scheduler
batchDuration - The time interval at which streaming data will be divided into batches
sparkHome - The SPARK_HOME directory on the slave nodes
jars - Collection of JARs to send to the cluster. These can be paths on the local file system or HDFS, HTTP, HTTPS, or FTP URLs.
environment - Environment variables to set on worker nodes

public JavaStreamingContext(JavaSparkContext sparkContext, Duration batchDuration)
Create a JavaStreamingContext using an existing JavaSparkContext.
Parameters:
sparkContext - The underlying JavaSparkContext to use
batchDuration - The time interval at which streaming data will be divided into batches

public JavaStreamingContext(SparkConf conf, Duration batchDuration)
Create a JavaStreamingContext using a SparkConf configuration.
Parameters:
conf - A Spark application configuration
batchDuration - The time interval at which streaming data will be divided into batches

public JavaStreamingContext(String path)
Recreate a JavaStreamingContext from a checkpoint file.
Parameters:
path - Path to the directory that was specified as the checkpoint directory

public JavaStreamingContext(String path, org.apache.hadoop.conf.Configuration hadoopConf)
Recreate a JavaStreamingContext from a checkpoint file.
Parameters:
path - Path to the directory that was specified as the checkpoint directory
hadoopConf - Hadoop configuration if necessary for reading from any HDFS compatible file system
public static JavaStreamingContext getOrCreate(String checkpointPath, JavaStreamingContextFactory factory)
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. If checkpoint data exists in the provided checkpointPath, then the StreamingContext will be recreated from the checkpoint data. If the data does not exist, then the provided factory will be used to create a JavaStreamingContext.
Parameters:
checkpointPath - Checkpoint directory used in an earlier JavaStreamingContext program
factory - JavaStreamingContextFactory object to create a new JavaStreamingContext

public static JavaStreamingContext getOrCreate(String checkpointPath, org.apache.hadoop.conf.Configuration hadoopConf, JavaStreamingContextFactory factory)
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. If checkpoint data exists in the provided checkpointPath, then the StreamingContext will be recreated from the checkpoint data. If the data does not exist, then the provided factory will be used to create a JavaStreamingContext.
Parameters:
checkpointPath - Checkpoint directory used in an earlier StreamingContext program
factory - JavaStreamingContextFactory object to create a new JavaStreamingContext
hadoopConf - Hadoop configuration if necessary for reading from any HDFS compatible file system

public static JavaStreamingContext getOrCreate(String checkpointPath, org.apache.hadoop.conf.Configuration hadoopConf, JavaStreamingContextFactory factory, boolean createOnError)
Either recreate a StreamingContext from checkpoint data or create a new StreamingContext. If checkpoint data exists in the provided checkpointPath, then the StreamingContext will be recreated from the checkpoint data. If the data does not exist, then the provided factory will be used to create a JavaStreamingContext.
Parameters:
checkpointPath - Checkpoint directory used in an earlier StreamingContext program
factory - JavaStreamingContextFactory object to create a new JavaStreamingContext
hadoopConf - Hadoop configuration if necessary for reading from any HDFS compatible file system
createOnError - Whether to create a new JavaStreamingContext if there is an error in reading checkpoint data

public static String[] jarOfClass(Class<?> cls)
Find the JAR from which a given class was loaded, to make it easy for users to pass their JARs to StreamingContext.
public StreamingContext ssc()

public JavaSparkContext sparkContext()
The underlying SparkContext

public JavaSparkContext sc()
public JavaReceiverInputDStream<String> socketTextStream(String hostname, int port, StorageLevel storageLevel)
Create an input stream from network source hostname:port.
Parameters:
hostname - Hostname to connect to for receiving data
port - Port to connect to for receiving data
storageLevel - Storage level to use for storing the received objects

public JavaReceiverInputDStream<String> socketTextStream(String hostname, int port)
Create an input stream from network source hostname:port.
Parameters:
hostname - Hostname to connect to for receiving data
port - Port to connect to for receiving data

public <T> JavaReceiverInputDStream<T> socketStream(String hostname, int port, Function<java.io.InputStream,Iterable<T>> converter, StorageLevel storageLevel)
Create an input stream from network source hostname:port.
Parameters:
hostname - Hostname to connect to for receiving data
port - Port to connect to for receiving data
converter - Function to convert the byte stream to objects
storageLevel - Storage level to use for storing the received objects

public JavaDStream<String> textFileStream(String directory)
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as text files.
Parameters:
directory - HDFS directory to monitor for new files

public <T> JavaReceiverInputDStream<T> rawSocketStream(String hostname, int port, StorageLevel storageLevel)
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
Parameters:
hostname - Hostname to connect to for receiving data
port - Port to connect to for receiving data
storageLevel - Storage level to use for storing the received objects

public <T> JavaReceiverInputDStream<T> rawSocketStream(String hostname, int port)
Create an input stream from network source hostname:port, where data is received as serialized blocks (serialized using Spark's serializer) that can be directly pushed into the block manager without deserializing them.
Parameters:
hostname - Hostname to connect to for receiving data
port - Port to connect to for receiving data

public <K,V,F extends org.apache.hadoop.mapreduce.InputFormat<K,V>> JavaPairInputDStream<K,V> fileStream(String directory)
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them using the given key-value types and input format.
Parameters:
directory - HDFS directory to monitor for new files

public <T> JavaReceiverInputDStream<T> actorStream(akka.actor.Props props, String name, StorageLevel storageLevel, akka.actor.SupervisorStrategy supervisorStrategy)
Create an input stream with any arbitrary user-implemented actor receiver.
Parameters:
props - Props object defining creation of the actor
name - Name of the actor
storageLevel - Storage level to use for storing the received objects

public <T> JavaReceiverInputDStream<T> actorStream(akka.actor.Props props, String name, StorageLevel storageLevel)
Create an input stream with any arbitrary user-implemented actor receiver.
Parameters:
props - Props object defining creation of the actor
name - Name of the actor
storageLevel - Storage level to use for storing the received objects

public <T> JavaReceiverInputDStream<T> actorStream(akka.actor.Props props, String name)
Create an input stream with any arbitrary user-implemented actor receiver.
Parameters:
props - Props object defining creation of the actor
name - Name of the actor
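For local testing without any network source, the queueStream methods detailed below accept a queue of precomputed RDDs. A sketch, assuming the Spark 1.x API on the classpath:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.Queue;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class QueueStreamExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("QueueStreamExample");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));

    // Fill the queue BEFORE creating the stream: later additions are not recognized.
    Queue<JavaRDD<Integer>> queue = new LinkedList<>();
    queue.add(jssc.sparkContext().parallelize(Arrays.asList(1, 2, 3)));
    queue.add(jssc.sparkContext().parallelize(Arrays.asList(4, 5, 6)));

    // oneAtATime = true: one RDD is consumed from the queue per batch interval.
    JavaDStream<Integer> stream = jssc.queueStream(queue, true);
    stream.print();

    jssc.start();
    jssc.awaitTermination(5000); // run for a few batches, then return
    jssc.stop();
  }
}
```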
public <T> JavaDStream<T> queueStream(java.util.Queue<JavaRDD<T>> queue)
Create an input stream from a queue of RDDs. NOTE: changes to the queue after the stream is created will not be recognized.
Parameters:
queue - Queue of RDDs

public <T> JavaInputDStream<T> queueStream(java.util.Queue<JavaRDD<T>> queue, boolean oneAtATime)
Create an input stream from a queue of RDDs. NOTE: changes to the queue after the stream is created will not be recognized.
Parameters:
queue - Queue of RDDs
oneAtATime - Whether only one RDD should be consumed from the queue in every interval

public <T> JavaInputDStream<T> queueStream(java.util.Queue<JavaRDD<T>> queue, boolean oneAtATime, JavaRDD<T> defaultRDD)
Create an input stream from a queue of RDDs. NOTE: changes to the queue after the stream is created will not be recognized.
Parameters:
queue - Queue of RDDs
oneAtATime - Whether only one RDD should be consumed from the queue in every interval
defaultRDD - Default RDD returned by the DStream when the queue is empty

public <T> JavaReceiverInputDStream<T> receiverStream(Receiver<T> receiver)
Create an input stream with any arbitrary user-implemented receiver.
Parameters:
receiver - Custom implementation of Receiver

public <T> JavaDStream<T> union(JavaDStream<T> first, java.util.List<JavaDStream<T>> rest)
Create a unified DStream from multiple DStreams of the same type and same slide duration.
public <K,V> JavaPairDStream<K,V> union(JavaPairDStream<K,V> first, java.util.List<JavaPairDStream<K,V>> rest)
Create a unified DStream from multiple DStreams of the same type and same slide duration.
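A sketch of merging two socket streams with union, assuming the Spark 1.x API; the host names are hypothetical, and the master needs enough threads for both receivers plus processing:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class UnionExample {
  public static void main(String[] args) {
    // local[3]: two threads for the two receivers, one for processing.
    SparkConf conf = new SparkConf().setMaster("local[3]").setAppName("UnionExample");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));

    // Two streams of the same element type and slide duration (hypothetical hosts).
    JavaReceiverInputDStream<String> s1 = jssc.socketTextStream("host-a", 9999);
    JavaReceiverInputDStream<String> s2 = jssc.socketTextStream("host-b", 9999);

    // Merge them into one DStream and process the combined data.
    JavaDStream<String> merged = jssc.union(s1, Arrays.<JavaDStream<String>>asList(s2));
    merged.print();

    jssc.start();
    jssc.awaitTermination();
  }
}
```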
public <T> JavaDStream<T> transform(java.util.List<JavaDStream<?>> dstreams, Function2<java.util.List<JavaRDD<?>>,Time,JavaRDD<T>> transformFunc)
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams. NOTE: to add a JavaPairDStream to the list of JavaDStreams, convert it to a JavaDStream using JavaPairDStream.toJavaDStream(). In the transform function, convert the JavaRDD corresponding to that JavaDStream to a JavaPairRDD using org.apache.spark.api.java.JavaPairRDD.fromJavaRDD().

public <K,V> JavaPairDStream<K,V> transformToPair(java.util.List<JavaDStream<?>> dstreams, Function2<java.util.List<JavaRDD<?>>,Time,JavaPairRDD<K,V>> transformFunc)
Create a new DStream in which each RDD is generated by applying a function on RDDs of the DStreams. NOTE: to add a JavaPairDStream to the list of JavaDStreams, convert it to a JavaDStream using JavaPairDStream.toJavaDStream(). In the transform function, convert the JavaRDD corresponding to that JavaDStream to a JavaPairRDD using org.apache.spark.api.java.JavaPairRDD.fromJavaRDD().

public void checkpoint(String directory)
Sets the context to periodically checkpoint the DStream operations for master fault-tolerance.
Parameters:
directory - HDFS-compatible directory where the checkpoint data will be reliably stored

public void remember(Duration duration)
Sets each DStream in this context to remember the RDDs it generated in the last given duration.
Parameters:
duration - Minimum duration that each DStream should remember its RDDs

public void addStreamingListener(StreamingListener streamingListener)
Add a StreamingListener object for receiving system events related to streaming.

public void start()
Start the execution of the streams.

public void awaitTermination()
Wait for the execution to stop.

public void awaitTermination(long timeout)
Wait for the execution to stop.
Parameters:
timeout - time to wait in milliseconds

public void stop()
Stop the execution of the streams.

public void stop(boolean stopSparkContext)
Stop the execution of the streams.
Parameters:
stopSparkContext - Whether to stop the associated SparkContext

public void stop(boolean stopSparkContext, boolean stopGracefully)
Stop the execution of the streams.
Parameters:
stopSparkContext - Whether to stop the associated SparkContext
stopGracefully - Stop gracefully by waiting for the processing of all received data to be completed

public void close()
Specified by:
close in interface java.io.Closeable
Specified by:
close in interface AutoCloseable
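The stop overloads above can be combined with a bounded wait to shut a driver down cleanly. A sketch, assuming the Spark 1.x API; whether to keep the SparkContext alive depends on whether other jobs still need it:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ShutdownExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("ShutdownExample");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(1000));
    // ... define input DStreams and transformations here ...
    jssc.start();
    jssc.awaitTermination(10000); // wait up to 10 s for termination

    // Graceful shutdown: drain data already received, and also stop the SparkContext.
    // Pass stopSparkContext = false instead to keep reusing the SparkContext.
    jssc.stop(true, true);
  }
}
```

Since JavaStreamingContext implements java.io.Closeable, it can also be managed with try-with-resources, in which case close() stops the context when the block exits.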