pyspark.sql.SparkSession.readStream
property SparkSession.readStream
Returns a DataStreamReader that can be used to read data streams as a streaming DataFrame.
New in version 2.0.0.
Changed in version 3.5.0: Supports Spark Connect.
Returns
DataStreamReader
Notes
This API is evolving.
Examples
>>> spark.readStream
<pyspark...DataStreamReader object ...>
The example below uses the Rate source, which generates rows continuously. It then computes each value modulo 3 and writes the stream to the console. The streaming query is stopped after 3 seconds.
>>> import time
>>> df = spark.readStream.format("rate").load()
>>> df = df.selectExpr("value % 3 as v")
>>> q = df.writeStream.format("console").start()
>>> time.sleep(3)
>>> q.stop()
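To make the effect of the `value % 3 as v` expression concrete, here is a plain-Python sketch that does not use Spark. It assumes the Rate source's `value` column behaves like a counter that starts at 0 and increases by 1 per generated row. The helper name `simulate_rate_values` is hypothetical and exists only for this illustration.

```python
from itertools import count, islice


def simulate_rate_values(n):
    """Return the first n values of a counter that starts at 0.

    Assumption: this mimics the `value` column of the Rate source,
    which increases by 1 for each generated row.
    """
    return list(islice(count(0), n))


values = simulate_rate_values(7)

# Same arithmetic as selectExpr("value % 3 as v") for non-negative inputs.
v = [value % 3 for value in values]
print(v)  # [0, 1, 2, 0, 1, 2, 0]
```

Under that assumption, the `v` column written to the console cycles through 0, 1, and 2. How many rows appear in each console batch depends on the source rate and the trigger timing, which this sketch does not model.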