Interface MicroBatchStream
- All Superinterfaces:
SparkDataStream
A SparkDataStream for streaming queries with micro-batch mode.
- Since:
- 3.0.0
Method Summary
- PartitionReaderFactory createReaderFactory(): Returns a factory to create a PartitionReader for each InputPartition.
- Offset latestOffset(): Returns the most recent offset available.
- InputPartition[] planInputPartitions(Offset start, Offset end): Returns a list of input partitions given the start and end offsets.
Methods inherited from interface org.apache.spark.sql.connector.read.streaming.SparkDataStream:
commit, deserializeOffset, initialOffset, stop
Method Details
latestOffset
Offset latestOffset()
Returns the most recent offset available.
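The engine calls latestOffset() when planning each micro-batch to find out how far it can read. A minimal sketch of the idea, using a hypothetical in-memory append-only log as the backing system (a real stream would query Kafka, a file listing, etc., and return a Spark Offset rather than a long):

```java
// Sketch only: InMemoryLog and latestOffset(log) are hypothetical stand-ins,
// not part of Spark's API.
import java.util.ArrayList;
import java.util.List;

public class LatestOffsetSketch {
    // Hypothetical append-only log; its size is the furthest readable position.
    static final class InMemoryLog {
        private final List<String> records = new ArrayList<>();
        void append(String record) { records.add(record); }
        long size() { return records.size(); }
    }

    // Mirrors the latestOffset() contract: report the most recent offset
    // available in the external system right now.
    static long latestOffset(InMemoryLog log) {
        return log.size();
    }

    public static void main(String[] args) {
        InMemoryLog log = new InMemoryLog();
        log.append("a");
        log.append("b");
        System.out.println(latestOffset(log)); // prints 2
    }
}
```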
planInputPartitions
InputPartition[] planInputPartitions(Offset start, Offset end)
Returns a list of input partitions given the start and end offsets. Each InputPartition represents a data split that can be processed by one Spark task. The number of input partitions returned here is the same as the number of RDD partitions this scan outputs.
If the Scan supports filter pushdown, this stream is likely configured with a filter and is responsible for creating splits for that filter, which is not a full scan.
This method will be called multiple times, to launch one Spark job for each micro-batch in this data stream.
createReaderFactory
PartitionReaderFactory createReaderFactory()
Returns a factory to create a PartitionReader for each InputPartition.
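The factory is created once on the driver and shipped to executors, where each task uses it to build a reader for its own partition. A self-contained sketch of that shape, again with hypothetical stand-ins (RangeInputPartition, RangePartitionReader) rather than Spark's real InputPartition and PartitionReader<InternalRow>:

```java
// Sketch only: these classes are hypothetical stand-ins for Spark's
// InputPartition, PartitionReader, and PartitionReaderFactory.
public class ReaderFactorySketch {
    // One split covering [start, end), end exclusive.
    static final class RangeInputPartition {
        final long start, end;
        RangeInputPartition(long start, long end) { this.start = start; this.end = end; }
    }

    // Stand-in for PartitionReader: iterates the values of one split.
    static final class RangePartitionReader {
        private long current;
        private final long end;
        RangePartitionReader(RangeInputPartition p) { this.current = p.start - 1; this.end = p.end; }
        boolean next() { current++; return current < end; } // advance; false when exhausted
        long get() { return current; }                      // current value (a row, in Spark)
    }

    // Stand-in for createReaderFactory(): one reader per input partition.
    static RangePartitionReader createReader(RangeInputPartition partition) {
        return new RangePartitionReader(partition);
    }

    public static void main(String[] args) {
        RangePartitionReader reader = createReader(new RangeInputPartition(3, 6));
        while (reader.next()) {
            System.out.println(reader.get()); // prints 3, 4, 5
        }
    }
}
```

Keeping the factory separate from the readers matters because the factory must be serializable to reach the executors, while the readers it creates can hold non-serializable state such as open connections.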