pyspark.sql.streaming.DataStreamWriter.trigger

DataStreamWriter.trigger(*, processingTime=None, once=None, continuous=None, availableNow=None)

Set the trigger for the streaming query. If no trigger is set, the query runs as fast as possible, which is equivalent to setting the trigger to processingTime='0 seconds'.

New in version 2.0.0.

Changed in version 3.5.0: Supports Spark Connect.

Parameters

processingTime (str, optional): a processing time interval as a string, e.g. '5 seconds', '1 minute'. Set a trigger that runs a microbatch query periodically based on the processing time. Only one trigger can be set.
once (bool, optional): if set to True, set a trigger that processes only one batch of data in a streaming query, then terminates the query. Only one trigger can be set.
continuous (str, optional): a time interval as a string, e.g. '5 seconds', '1 minute'. Set a trigger that runs a continuous query with the given checkpoint interval. Only one trigger can be set.
availableNow (bool, optional): if set to True, set a trigger that processes all available data in multiple batches, then terminates the query. Only one trigger can be set.
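
The "only one trigger can be set" rule shared by all four parameters can be illustrated with a small pure-Python sketch. The helper name `pick_trigger` and its return value are hypothetical and are not part of PySpark; the sketch only mirrors the keyword-only signature above and assumes that passing no trigger or more than one trigger should be rejected with a ValueError.

```python
from typing import Tuple, Union


def pick_trigger(
    *,
    processingTime: Union[str, None] = None,
    once: Union[bool, None] = None,
    continuous: Union[str, None] = None,
    availableNow: Union[bool, None] = None,
) -> Tuple[str, Union[str, bool]]:
    """Hypothetical helper: return the single trigger that was set.

    This is an illustration of the mutual-exclusivity rule, not
    PySpark's implementation.
    """
    candidates = {
        "processingTime": processingTime,
        "once": once,
        "continuous": continuous,
        "availableNow": availableNow,
    }
    # Keep only the keyword arguments the caller actually supplied.
    chosen = {name: value for name, value in candidates.items() if value is not None}
    if len(chosen) != 1:
        # Assumption of this sketch: zero or several triggers are both errors.
        raise ValueError(
            f"exactly one trigger must be set, got {sorted(chosen) or 'none'}"
        )
    name, value = next(iter(chosen.items()))
    return name, value
```

For instance, `pick_trigger(processingTime='5 seconds')` returns the pair `('processingTime', '5 seconds')`, while `pick_trigger(once=True, availableNow=True)` raises ValueError in this sketch.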

Notes

This API is evolving.

Examples

>>> df = spark.readStream.format("rate").load()

Trigger the query for execution every 5 seconds

>>> df.writeStream.trigger(processingTime='5 seconds')
<...streaming.readwriter.DataStreamWriter object ...>

Trigger the query in continuous mode with a checkpoint interval of 5 seconds

>>> df.writeStream.trigger(continuous='5 seconds')
<...streaming.readwriter.DataStreamWriter object ...>

Trigger the query to read all available data in multiple batches, then terminate

>>> df.writeStream.trigger(availableNow=True)
<...streaming.readwriter.DataStreamWriter object ...>