@Evolving
public interface MicroBatchStream extends SparkDataStream

A SparkDataStream for streaming queries with micro-batch mode.

Modifier and Type | Method and Description
---|---
PartitionReaderFactory | createReaderFactory(): Returns a factory to create a PartitionReader for each InputPartition.
Offset | latestOffset(): Returns the most recent offset available.
InputPartition[] | planInputPartitions(Offset start, Offset end): Returns a list of input partitions given the start and end offsets.
Methods inherited from interface SparkDataStream: commit, deserializeOffset, initialOffset, stop
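To make these methods concrete, what follows is a minimal sketch of the offset-handling half of an implementation, assuming a hypothetical source whose position is a single increasing record counter. CounterOffset, CounterStreamBase, and the highestAvailable supplier are illustrative names, not part of Spark's API; planInputPartitions and createReaderFactory are left abstract here and are sketched after their method details below.

```java
import java.util.function.LongSupplier;

import org.apache.spark.sql.connector.read.streaming.MicroBatchStream;
import org.apache.spark.sql.connector.read.streaming.Offset;

// Hypothetical offset type: the stream's position is a single record counter.
class CounterOffset extends Offset {
    final long value;
    CounterOffset(long value) { this.value = value; }
    // json() is what Spark persists to the checkpoint log.
    @Override public String json() { return Long.toString(value); }
}

// Offset bookkeeping for a counter-based source. planInputPartitions and
// createReaderFactory are deliberately left abstract in this sketch.
abstract class CounterStreamBase implements MicroBatchStream {
    private final LongSupplier highestAvailable; // hypothetical probe of the external system

    CounterStreamBase(LongSupplier highestAvailable) {
        this.highestAvailable = highestAvailable;
    }

    // Called once per trigger; Spark uses the result as the end bound of the next micro-batch.
    @Override public Offset latestOffset() {
        return new CounterOffset(highestAvailable.getAsLong());
    }

    // Where to start when no checkpoint exists yet.
    @Override public Offset initialOffset() { return new CounterOffset(0L); }

    // Restore an offset previously written to the checkpoint as json().
    @Override public Offset deserializeOffset(String json) {
        return new CounterOffset(Long.parseLong(json));
    }

    // All records up to `end` are fully processed; the source may discard them.
    @Override public void commit(Offset end) { }

    // Release any resources held by the stream.
    @Override public void stop() { }
}
```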
Offset latestOffset()
Returns the most recent offset available.
InputPartition[] planInputPartitions(Offset start, Offset end)
Returns a list of input partitions given the start and end offsets. Each InputPartition represents a data split that can be processed by one Spark task. The number of input partitions returned here is the same as the number of RDD partitions this scan outputs.
If the Scan supports filter pushdown, this stream is likely configured with a filter and is responsible for creating splits for that filter, which is not a full scan.
This method will be called multiple times, to launch one Spark job for each micro-batch in this data stream.
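Since each InputPartition maps to one Spark task and the partition count fixes the scan's RDD partition count, an implementation typically balances the batch's offset range across its splits. The sketch below shows one hedged way to do that for the counter-based source above; RangeInputPartition, RangeSplitter, and maxSplits are hypothetical names, and a real planInputPartitions(Offset, Offset) would first unwrap its own Offset type into numeric bounds.

```java
import org.apache.spark.sql.connector.read.InputPartition;

// Hypothetical serializable split describing the half-open record range [start, end).
class RangeInputPartition implements InputPartition {
    final long start;
    final long end;
    RangeInputPartition(long start, long end) { this.start = start; this.end = end; }
}

final class RangeSplitter {
    // Splits one micro-batch's offset range into at most maxSplits (>= 1) pieces of
    // near-equal size. Each returned InputPartition becomes one Spark task, and the
    // scan's output RDD has exactly as many partitions as this array has elements.
    static InputPartition[] planInputPartitions(long start, long end, int maxSplits) {
        long total = Math.max(0L, end - start);
        int n = (int) Math.min((long) maxSplits, Math.max(1L, total));
        InputPartition[] splits = new InputPartition[n];
        long base = total / n;
        long remainder = total % n;
        long cursor = start;
        for (int i = 0; i < n; i++) {
            // Spread the remainder over the first few splits so sizes differ by at most 1.
            long size = base + (i < remainder ? 1 : 0);
            splits[i] = new RangeInputPartition(cursor, cursor + size);
            cursor += size;
        }
        return splits;
    }
}
```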
PartitionReaderFactory createReaderFactory()
Returns a factory to create a PartitionReader for each InputPartition.
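The factory returned here is created on the driver and serialized to executors, where each task uses it to open a PartitionReader over its own split. Below is a hedged sketch continuing the example: it reuses the hypothetical RangeInputPartition class from the planning sketch above and assumes a toy schema of a single bigint column; RangeReaderFactory is an invented name.

```java
import org.apache.spark.sql.catalyst.InternalRow;
import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
import org.apache.spark.sql.connector.read.InputPartition;
import org.apache.spark.sql.connector.read.PartitionReader;
import org.apache.spark.sql.connector.read.PartitionReaderFactory;

// One factory serves all partitions; each task passes in its own InputPartition.
class RangeReaderFactory implements PartitionReaderFactory {
    @Override
    public PartitionReader<InternalRow> createReader(InputPartition partition) {
        RangeInputPartition range = (RangeInputPartition) partition;
        return new PartitionReader<InternalRow>() {
            private long current = range.start - 1;

            // Advance to the next record; returning false ends this task's input.
            @Override public boolean next() {
                current++;
                return current < range.end;
            }

            // Emit the current record as a single-column row (the record's position).
            @Override public InternalRow get() {
                return new GenericInternalRow(new Object[] { current });
            }

            @Override public void close() { /* nothing to release in this sketch */ }
        };
    }
}
```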