@Evolving
public interface MicroBatchStream extends SparkDataStream

A SparkDataStream for streaming queries with micro-batch mode.

Modifier and Type | Method and Description
---|---
PartitionReaderFactory | createReaderFactory(): Returns a factory to create a PartitionReader for each InputPartition.
Offset | latestOffset(): Returns the most recent offset available.
InputPartition[] | planInputPartitions(Offset start, Offset end): Returns a list of input partitions given the start and end offsets.
Methods inherited from interface SparkDataStream: commit, deserializeOffset, initialOffset, stop
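To make these methods concrete, what follows is a minimal sketch of the offset-handling half of an implementation, assuming a hypothetical source whose position is a single increasing record counter. CounterOffset, CounterStreamBase, and the highestAvailable supplier are illustrative names, not part of Spark's API; planInputPartitions and createReaderFactory are left abstract here and are sketched after their method details below.

```java
import java.util.function.LongSupplier;

import org.apache.spark.sql.connector.read.streaming.MicroBatchStream;
import org.apache.spark.sql.connector.read.streaming.Offset;

// Hypothetical offset type: the stream's position is a single record counter.
class CounterOffset extends Offset {
    final long value;
    CounterOffset(long value) { this.value = value; }
    // json() is what Spark persists to the checkpoint log.
    @Override public String json() { return Long.toString(value); }
}

// Offset bookkeeping for a counter-based source. planInputPartitions and
// createReaderFactory are deliberately left abstract in this sketch.
abstract class CounterStreamBase implements MicroBatchStream {
    private final LongSupplier highestAvailable; // hypothetical probe of the external system

    CounterStreamBase(LongSupplier highestAvailable) {
        this.highestAvailable = highestAvailable;
    }

    // Called once per trigger; Spark uses the result as the end bound of the next micro-batch.
    @Override public Offset latestOffset() {
        return new CounterOffset(highestAvailable.getAsLong());
    }

    // Where to start when no checkpoint exists yet.
    @Override public Offset initialOffset() { return new CounterOffset(0L); }

    // Restore an offset previously written to the checkpoint as json().
    @Override public Offset deserializeOffset(String json) {
        return new CounterOffset(Long.parseLong(json));
    }

    // All records up to `end` are fully processed; the source may discard them.
    @Override public void commit(Offset end) { }

    // Release any resources held by the stream.
    @Override public void stop() { }
}
```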
Offset latestOffset()
Returns the most recent offset available.
InputPartition[] planInputPartitions(Offset start, Offset end)
Returns a list of input partitions given the start and end offsets. Each InputPartition represents a data split that can be processed by one Spark task. The number of input partitions returned here is the same as the number of RDD partitions this scan outputs.
If the Scan supports filter pushdown, this stream is likely configured with a filter and is responsible for creating splits for that filter, which is not a full scan.
This method will be called multiple times, to launch one Spark job for each micro-batch in this data stream.
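Since each InputPartition maps to one Spark task and the partition count fixes the scan's RDD partition count, an implementation typically balances the batch's offset range across its splits. The sketch below shows one hedged way to do that for the counter-based source above; RangeInputPartition, RangeSplitter, and maxSplits are hypothetical names, and a real planInputPartitions(Offset, Offset) would first unwrap its own Offset type into numeric bounds.

```java
import org.apache.spark.sql.connector.read.InputPartition;

// Hypothetical serializable split describing the half-open record range [start, end).
class RangeInputPartition implements InputPartition {
    final long start;
    final long end;
    RangeInputPartition(long start, long end) { this.start = start; this.end = end; }
}

final class RangeSplitter {
    // Splits one micro-batch's offset range into at most maxSplits (>= 1) pieces of
    // near-equal size. Each returned InputPartition becomes one Spark task, and the
    // scan's output RDD has exactly as many partitions as this array has elements.
    static InputPartition[] planInputPartitions(long start, long end, int maxSplits) {
        long total = Math.max(0L, end - start);
        int n = (int) Math.min((long) maxSplits, Math.max(1L, total));
        InputPartition[] splits = new InputPartition[n];
        long base = total / n;
        long remainder = total % n;
        long cursor = start;
        for (int i = 0; i < n; i++) {
            // Spread the remainder over the first few splits so sizes differ by at most 1.
            long size = base + (i < remainder ? 1 : 0);
            splits[i] = new RangeInputPartition(cursor, cursor + size);
            cursor += size;
        }
        return splits;
    }
}
```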
PartitionReaderFactory createReaderFactory()
Returns a factory to create a PartitionReader for each InputPartition.
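The factory returned here is created on the driver and serialized to executors, where each task uses it to open a PartitionReader over its own split. Below is a hedged sketch continuing the example: it reuses the hypothetical RangeInputPartition class from the planning sketch above and assumes a toy schema of a single bigint column; RangeReaderFactory is an invented name.

```java
import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
import org.apache.spark.sql.connector.read.InputPartition;
import org.apache.spark.sql.connector.read.PartitionReader;
import org.apache.spark.sql.connector.read.PartitionReaderFactory;

// One factory serves all partitions; each task passes in its own InputPartition.
class RangeReaderFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<InternalRow> createReader(InputPartition partition) {
        RangeInputPartition range = (RangeInputPartition) partition;
        return new PartitionReader<InternalRow>() {
            private long current = range.start - 1;

            // Advance to the next record; returning false ends this task's input.
            @Override public boolean next() {
                current++;
                return current < range.end;
            }

            // Emit the current record as a single-column row (the record's position).
            @Override public InternalRow get() {
                return new GenericInternalRow(new Object[] { current });
            }

            @Override public void close() { /* nothing to release in this sketch */ }
        };
    }
}
```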