Class JavaDStream<T>

Object
org.apache.spark.streaming.api.java.JavaDStream<T>
All Implemented Interfaces:
Serializable, JavaDStreamLike<T,JavaDStream<T>,JavaRDD<T>>, scala.Serializable
Direct Known Subclasses:
JavaInputDStream, JavaMapWithStateDStream

public class JavaDStream<T> extends Object
A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data. DStreams can be created either from live data (such as data from TCP sockets or Kafka) or by transforming existing DStreams with operations such as map and window. For operations applicable to key-value pair DStreams, see JavaPairDStream.
  • Constructor Details

    • JavaDStream

      public JavaDStream(DStream<T> dstream, scala.reflect.ClassTag<T> classTag)
  • Method Details

    • fromDStream

      public static <T> JavaDStream<T> fromDStream(DStream<T> dstream, scala.reflect.ClassTag<T> evidence$1)
      Convert a Scala DStream to a Java-friendly JavaDStream.
      Parameters:
      dstream - (undocumented)
      evidence$1 - (undocumented)
      Returns:
      (undocumented)
    • dstream

      public DStream<T> dstream()
    • classTag

      public scala.reflect.ClassTag<T> classTag()
    • wrapRDD

      public JavaRDD<T> wrapRDD(RDD<T> rdd)
    • filter

      public JavaDStream<T> filter(Function<T,Boolean> f)
      Return a new DStream containing only the elements that satisfy a predicate.
    • cache

      public JavaDStream<T> cache()
      Persist the RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
    • persist

      public JavaDStream<T> persist()
      Persist the RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
    • persist

      public JavaDStream<T> persist(StorageLevel storageLevel)
      Persist the RDDs of this DStream with the given storage level.
    • compute

      public JavaRDD<T> compute(Time validTime)
      Generate an RDD for the given time (validTime).
    • window

      public JavaDStream<T> window(Duration windowDuration)
      Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream. The new DStream generates RDDs at the same interval as this DStream.
      Parameters:
      windowDuration - width of the window; must be a multiple of this DStream's batching interval.
      Returns:
      (undocumented)
    • window

      public JavaDStream<T> window(Duration windowDuration, Duration slideDuration)
      Return a new DStream in which each RDD contains all the elements seen in a sliding window of time over this DStream.
      Parameters:
      windowDuration - width of the window; must be a multiple of this DStream's batching interval
      slideDuration - sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval
      Returns:
      (undocumented)
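
The relationship between windowDuration, slideDuration, and the batching interval can be easier to follow with concrete numbers. The following is a minimal plain-Java sketch of the window arithmetic only; it does not use Spark. The class WindowModel and its methods are hypothetical names introduced for illustration, and the sketch assumes that the stream starts at time 0, that batches are identified by their end time in milliseconds, and that a window ending at time t covers the batches in the interval (t - windowDuration, t].

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java model of window arithmetic; this is not Spark code. */
final class WindowModel {

    private WindowModel() {}

    /**
     * Lists the end times (ms) of the batches covered by the window that ends at windowEndMs.
     * Assumes the stream starts at time 0, so batches with non-positive end times do not exist.
     */
    static List<Long> batchesInWindow(long windowEndMs, long windowMs, long batchMs) {
        requireMultiple(windowMs, batchMs, "windowDuration");
        List<Long> batchTimes = new ArrayList<>();
        for (long t = windowEndMs - windowMs + batchMs; t <= windowEndMs; t += batchMs) {
            if (t > 0) {
                batchTimes.add(t);
            }
        }
        return batchTimes;
    }

    /** Lists the times (ms) at which the windowed stream produces an RDD, up to horizonMs. */
    static List<Long> emitTimes(long slideMs, long batchMs, long horizonMs) {
        requireMultiple(slideMs, batchMs, "slideDuration");
        List<Long> times = new ArrayList<>();
        for (long t = slideMs; t <= horizonMs; t += slideMs) {
            times.add(t);
        }
        return times;
    }

    /** Mirrors the documented constraint: durations must be multiples of the batching interval. */
    private static void requireMultiple(long valueMs, long batchMs, String name) {
        if (batchMs <= 0 || valueMs <= 0 || valueMs % batchMs != 0) {
            throw new IllegalArgumentException(
                name + " must be a positive multiple of the batching interval");
        }
    }

    public static void main(String[] args) {
        long batchMs = 1000L;   // batching interval of the source stream
        long windowMs = 3000L;  // windowDuration
        long slideMs = 2000L;   // slideDuration
        for (long end : emitTimes(slideMs, batchMs, 6000L)) {
            System.out.println("window ending at " + end + " ms covers batches "
                + batchesInWindow(end, windowMs, batchMs));
        }
    }
}
```

With a 1-second batching interval, a 3-second window, and a 2-second slide, the model emits a window every 2 seconds, and each window spans the three most recent batches once enough data has arrived. The single-argument window(windowDuration) corresponds to setting the slide equal to the batching interval.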
    • union

      public JavaDStream<T> union(JavaDStream<T> that)
      Return a new DStream by unifying data of another DStream with this DStream.
      Parameters:
      that - Another DStream having the same interval (i.e., slideDuration) as this DStream.
      Returns:
      (undocumented)
    • repartition

      public JavaDStream<T> repartition(int numPartitions)
      Return a new DStream with an increased or decreased level of parallelism. Each RDD in the returned DStream has exactly numPartitions partitions.
      Parameters:
      numPartitions - number of partitions for each RDD in the returned DStream
      Returns:
      (undocumented)