Streams the contents of the DataFrame to a data source.
The data source is specified by the format and a set of options.
If format is not specified, the default data source configured by
spark.sql.sources.default will be used.
New in version 2.0.0.
Parameters
----------
path : str, optional
    the path in a Hadoop supported file system
format : str, optional
    the format used to save
outputMode : str, optional
    specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.

    * ``append``: Only the new rows in the streaming DataFrame/Dataset will be written
      to the sink.
    * ``complete``: All the rows in the streaming DataFrame/Dataset will be written to
      the sink every time there are some updates.
    * ``update``: Only the rows that were updated in the streaming DataFrame/Dataset
      will be written to the sink every time there are some updates. If the query
      doesn't contain aggregations, it will be equivalent to ``append`` mode.
partitionBy : str or list, optional
    names of partitioning columns
queryName : str, optional
    unique name for the query
**options : dict
    All other string options. You may want to provide a ``checkpointLocation`` for
    most streams, however it is not required for a memory stream.
Notes
-----
This API is evolving.

Examples
--------
>>> df = spark.readStream.format("rate").load()
>>> q = df.writeStream.format('memory').queryName('this_query').start()
An example of using other parameters with a trigger:
>>> q = df.writeStream.trigger(processingTime='5 seconds').start(
... queryName='that_query', outputMode="append", format='memory')