trait ContinuousStream extends SparkDataStream
A SparkDataStream for streaming queries with continuous mode.
- Annotations
- @Evolving()
- Source
- ContinuousStream.java
- Since
3.0.0
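For context, a ContinuousStream only comes into play when a query runs with a continuous trigger. A minimal driver program using the built-in rate source and console sink, both of which support continuous mode (the application name, checkpoint path, and trigger interval below are arbitrary choices for this sketch):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object ContinuousRateExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("continuous-mode-example").getOrCreate()

    val query = spark.readStream
      .format("rate")
      .load()
      .writeStream
      .format("console")
      .trigger(Trigger.Continuous("1 second"))              // checkpoint offsets every second
      .option("checkpointLocation", "/tmp/continuous-ckpt") // path is arbitrary for the sketch
      .start()

    query.awaitTermination()
  }
}
```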
Abstract Value Members
- abstract def commit(end: Offset): Unit
Informs the source that Spark has completed processing all data for offsets less than or equal to end and will only request offsets greater than end in the future.
- Definition Classes
- SparkDataStream
- abstract def createContinuousReaderFactory(): ContinuousPartitionReaderFactory
Returns a factory to create a ContinuousPartitionReader for each InputPartition.
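As an illustration of what such a factory might return, here is a sketch of a counter-like source; all Counter* names are hypothetical, and a real reader would poll an external system rather than synthesize values:

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.read.InputPartition
import org.apache.spark.sql.connector.read.streaming.{ContinuousPartitionReader, ContinuousPartitionReaderFactory, PartitionOffset}

// Hypothetical split and per-partition offset types for a counter-like source.
case class CounterPartition(partitionId: Int, startValue: Long) extends InputPartition
case class CounterPartitionOffset(partitionId: Int, currentValue: Long) extends PartitionOffset

class CounterReaderFactory extends ContinuousPartitionReaderFactory {
  override def createReader(partition: InputPartition): ContinuousPartitionReader[InternalRow] = {
    val p = partition.asInstanceOf[CounterPartition]
    new CounterPartitionReader(p.partitionId, p.startValue)
  }
}

// Emits an ever-increasing counter; a real reader would block until new data arrives.
class CounterPartitionReader(partitionId: Int, start: Long)
    extends ContinuousPartitionReader[InternalRow] {

  private var current = start - 1

  // Continuous readers never report end-of-data.
  override def next(): Boolean = { current += 1; true }

  override def get(): InternalRow = InternalRow(partitionId, current)

  // The per-partition position that Spark periodically merges via mergeOffsets().
  override def getOffset(): PartitionOffset = CounterPartitionOffset(partitionId, current)

  override def close(): Unit = ()
}
```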
- abstract def deserializeOffset(json: String): Offset
Deserialize a JSON string into an Offset of the implementation-defined offset type.
- Definition Classes
- SparkDataStream
- Exceptions thrown
IllegalArgumentException if the JSON does not encode a valid offset for this reader
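For illustration, a hypothetical offset type whose json() output round-trips through deserializeOffset might look like the following; the serialization is hand-rolled purely to keep the sketch dependency-free:

```scala
import org.apache.spark.sql.connector.read.streaming.Offset

// Hypothetical global offset: one counter value per partition, rendered as a
// small JSON array.
case class CounterOffset(valuesByPartition: Map[Int, Long]) extends Offset {
  override def json(): String =
    valuesByPartition.toSeq.sorted
      .map { case (p, v) => s"""{"partition":$p,"value":$v}""" }
      .mkString("[", ",", "]")
}

object CounterOffset {
  private val Entry = """\{"partition":(\d+),"value":(\d+)\}""".r

  // Counterpart used from ContinuousStream.deserializeOffset(json).
  def fromJson(json: String): CounterOffset = {
    val entries = Entry.findAllMatchIn(json)
      .map(m => m.group(1).toInt -> m.group(2).toLong)
      .toMap
    if (entries.isEmpty && json.trim != "[]")
      throw new IllegalArgumentException(s"Not a valid CounterOffset: $json")
    CounterOffset(entries)
  }
}
```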
- abstract def initialOffset(): Offset
Returns the initial offset for a streaming query to start reading from. Note that the streaming data source should not assume that it will start reading from its initial offset: if Spark is restarting an existing query, it will restart from the check-pointed offset rather than the initial one.
- Definition Classes
- SparkDataStream
- abstract def mergeOffsets(offsets: Array[PartitionOffset]): Offset
Merge partitioned offsets coming from ContinuousPartitionReader instances for each partition to a single global offset.
- abstract def planInputPartitions(start: Offset): Array[InputPartition]
Returns a list of input partitions given the start offset. Each InputPartition represents a data split that can be processed by one Spark task. The number of input partitions returned here is the same as the number of RDD partitions this scan outputs.
If the Scan supports filter pushdown, this stream is likely configured with a filter and is responsible for creating splits for that filter, which is not a full scan.
This method will be called to launch one Spark job for reading the data stream. It will be called more than once if needsReconfiguration() returns true and Spark needs to launch a new job. (A sketch implementation covering these abstract members follows this list.)
- abstract def stop(): Unit
Stop this source and free any resources it has allocated.
- Definition Classes
- SparkDataStream
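Putting the abstract members together, a minimal sketch of a ContinuousStream, reusing the hypothetical Counter* types from the sketches above and assuming a fixed number of partitions:

```scala
import org.apache.spark.sql.connector.read.InputPartition
import org.apache.spark.sql.connector.read.streaming.{ContinuousPartitionReaderFactory, ContinuousStream, Offset, PartitionOffset}

class CounterContinuousStream(numPartitions: Int) extends ContinuousStream {

  // Fresh queries start counting from zero in every partition. A restarted query
  // never sees this value: Spark resumes from the checkpointed offset instead.
  override def initialOffset(): Offset =
    CounterOffset((0 until numPartitions).map(_ -> 0L).toMap)

  override def deserializeOffset(json: String): Offset =
    CounterOffset.fromJson(json)

  // One input partition per counter; a real source would derive splits from the
  // external system's current layout.
  override def planInputPartitions(start: Offset): Array[InputPartition] = {
    val startValues = start.asInstanceOf[CounterOffset].valuesByPartition
    startValues.map { case (p, v) => CounterPartition(p, v): InputPartition }.toArray
  }

  override def createContinuousReaderFactory(): ContinuousPartitionReaderFactory =
    new CounterReaderFactory

  // Combine the per-partition offsets reported by the readers into one global offset.
  override def mergeOffsets(offsets: Array[PartitionOffset]): Offset =
    CounterOffset(offsets.collect { case CounterPartitionOffset(p, v) => p -> v }.toMap)

  // Nothing to release for an in-memory counter; a real source might use commit()
  // to let the external system discard data at or below `end`.
  override def commit(end: Offset): Unit = ()

  override def stop(): Unit = ()
}
```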
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def needsReconfiguration(): Boolean
The execution engine will call this method in every epoch to determine if new input partitions need to be generated, which may be required if for example the underlying source system has had partitions added or removed.
If true, the Spark job to scan this continuous data stream will be interrupted and Spark will launch it again with a new list of input partitions.
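For example, a source whose partition layout can change at runtime might override it as sketched below; ExternalSystemClient is a hypothetical dependency and the remaining ContinuousStream members are omitted:

```scala
import org.apache.spark.sql.connector.read.streaming.ContinuousStream

// Hypothetical handle to the external system.
trait ExternalSystemClient {
  def partitionCount(): Int
}

// Sketch: ask Spark to re-plan input partitions whenever the external system's
// partition count changes.
abstract class ReconfigurableStream(client: ExternalSystemClient) extends ContinuousStream {

  private var knownPartitionCount: Int = client.partitionCount()

  override def needsReconfiguration(): Boolean = {
    val latest = client.partitionCount()
    val changed = latest != knownPartitionCount
    if (changed) knownPartitionCount = latest  // the next planInputPartitions() call sees the new count
    changed
  }
}
```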
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)