package streaming
Type Members
-    trait AcceptsLatestSeenOffset extends SparkDataStreamIndicates that the source accepts the latest seen offset, which requires streaming execution to provide the latest seen offset when restarting the streaming query from checkpoint. Indicates that the source accepts the latest seen offset, which requires streaming execution to provide the latest seen offset when restarting the streaming query from checkpoint. Note that this interface aims to only support DSv2 streaming sources. Spark may throw error if the interface is implemented along with DSv1 streaming sources. The callback method will be called once per run. - Since
- 3.3.0 
 
-   final  class CompositeReadLimit extends ReadLimit/** Represents a ReadLimitwhere theMicroBatchStreamshould scan approximately given maximum number of rows with at least the given minimum number of rows./** Represents a ReadLimitwhere theMicroBatchStreamshould scan approximately given maximum number of rows with at least the given minimum number of rows.- Annotations
- @Evolving()
- Since
- 3.2.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-    trait ContinuousPartitionReader[T] extends PartitionReader[T]A variation on PartitionReaderfor use with continuous streaming processing.A variation on PartitionReaderfor use with continuous streaming processing.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait ContinuousPartitionReaderFactory extends PartitionReaderFactoryA variation on PartitionReaderFactorythat returnsContinuousPartitionReaderinstead ofPartitionReader.A variation on PartitionReaderFactorythat returnsContinuousPartitionReaderinstead ofPartitionReader. It's used for continuous streaming processing.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait ContinuousStream extends SparkDataStreamA SparkDataStreamfor streaming queries with continuous mode.A SparkDataStreamfor streaming queries with continuous mode.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait MicroBatchStream extends SparkDataStreamA SparkDataStreamfor streaming queries with micro-batch mode.A SparkDataStreamfor streaming queries with micro-batch mode.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-   abstract  class Offset extends AnyRefAn abstract representation of progress through a MicroBatchStreamorContinuousStream.An abstract representation of progress through a MicroBatchStreamorContinuousStream.During execution, offsets provided by the data source implementation will be logged and used as restart checkpoints. Each source should provide an offset implementation which the source can use to reconstruct a position in the stream up to which data has been seen/processed. - Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait PartitionOffset extends SerializableUsed for per-partition offsets in continuous processing. Used for per-partition offsets in continuous processing. ContinuousReader implementations will provide a method to merge these into a global Offset. These offsets must be serializable. - Annotations
- @Evolving()
- Since
- 3.0.0 
 
-   final  class ReadAllAvailable extends ReadLimitRepresents a ReadLimitwhere theMicroBatchStreammust scan all the data available at the streaming source.Represents a ReadLimitwhere theMicroBatchStreammust scan all the data available at the streaming source. This is meant to be a hard specification as being able to return all available data is necessary forTrigger.Once()to work correctly. If a source is unable to scan all available data, then it must throw an error.- Annotations
- @Evolving()
- Since
- 3.0.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-    trait ReadLimit extends AnyRefInterface representing limits on how much to read from a MicroBatchStreamwhen it implementsSupportsAdmissionControl.Interface representing limits on how much to read from a MicroBatchStreamwhen it implementsSupportsAdmissionControl. There are several child interfaces representing various kinds of limits.- Annotations
- @Evolving()
- Since
- 3.0.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) - ReadAllAvailable - ReadMaxRows 
 
-    class ReadMaxBytes extends ReadLimitRepresents a ReadLimitwhere theMicroBatchStreamshould scan files which total size doesn't go beyond a given maximum total size.Represents a ReadLimitwhere theMicroBatchStreamshould scan files which total size doesn't go beyond a given maximum total size. Always reads at least one file so a stream can make progress and not get stuck on a file larger than a given maximum.- Annotations
- @Evolving()
- Since
- 4.0.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-    class ReadMaxFiles extends ReadLimitRepresents a ReadLimitwhere theMicroBatchStreamshould scan approximately the given maximum number of files.Represents a ReadLimitwhere theMicroBatchStreamshould scan approximately the given maximum number of files.- Annotations
- @Evolving()
- Since
- 3.0.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-   final  class ReadMaxRows extends ReadLimitRepresents a ReadLimitwhere theMicroBatchStreamshould scan approximately the given maximum number of rows.Represents a ReadLimitwhere theMicroBatchStreamshould scan approximately the given maximum number of rows.- Annotations
- @Evolving()
- Since
- 3.0.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-   final  class ReadMinRows extends ReadLimitRepresents a ReadLimitwhere theMicroBatchStreamshould scan approximately at least the given minimum number of rows.Represents a ReadLimitwhere theMicroBatchStreamshould scan approximately at least the given minimum number of rows.- Annotations
- @Evolving()
- Since
- 3.2.0 
- See also
- SupportsAdmissionControl#latestOffset(Offset, ReadLimit) 
 
-    trait ReportsSinkMetrics extends AnyRefA mix-in interface for streaming sinks to signal that they can report metrics. A mix-in interface for streaming sinks to signal that they can report metrics. - Annotations
- @Evolving()
- Since
- 3.4.0 
 
-    trait ReportsSourceMetrics extends SparkDataStreamA mix-in interface for SparkDataStreamstreaming sources to signal that they can report metrics.A mix-in interface for SparkDataStreamstreaming sources to signal that they can report metrics.- Annotations
- @Evolving()
- Since
- 3.2.0 
 
-    trait SparkDataStream extends AnyRefThe base interface representing a readable data stream in a Spark streaming query. The base interface representing a readable data stream in a Spark streaming query. It's responsible to manage the offsets of the streaming source in the streaming query. Data sources should implement concrete data stream interfaces: MicroBatchStreamandContinuousStream.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait SupportsAdmissionControl extends SparkDataStreamA mix-in interface for SparkDataStreamstreaming sources to signal that they can control the rate of data ingested into the system.A mix-in interface for SparkDataStreamstreaming sources to signal that they can control the rate of data ingested into the system. These rate limits can come implicitly from the contract of triggers, e.g. Trigger.Once() requires that a micro-batch process all data available to the system at the start of the micro-batch. Alternatively, sources can decide to limit ingest through data source options.Through this interface, a MicroBatchStream should be able to return the next offset that it will process until given a ReadLimit.- Annotations
- @Evolving()
- Since
- 3.0.0 
 
-    trait SupportsRealTimeMode extends AnyRefA MicroBatchStreamfor streaming queries with real time mode.A MicroBatchStreamfor streaming queries with real time mode.- Annotations
- @Evolving()
 
-    trait SupportsRealTimeRead[T] extends PartitionReader[T]A variation on PartitionReaderfor use with low latency streaming processing.A variation on PartitionReaderfor use with low latency streaming processing.- Annotations
- @Evolving()
 
-    trait SupportsTriggerAvailableNow extends SupportsAdmissionControlAn interface for streaming sources that supports running in Trigger.AvailableNow mode, which will process all the available data at the beginning of the query in (possibly) multiple batches. An interface for streaming sources that supports running in Trigger.AvailableNow mode, which will process all the available data at the beginning of the query in (possibly) multiple batches. This mode will have better scalability comparing to Trigger.Once mode. - Annotations
- @Evolving()
- Since
- 3.3.0