pyspark.sql.streaming.DataStreamReader.load
DataStreamReader.load(path: Optional[str] = None, format: Optional[str] = None, schema: Union[pyspark.sql.types.StructType, str, None] = None, **options: OptionalPrimitiveType) → DataFrame
Loads a data stream from a data source and returns it as a DataFrame.

New in version 2.0.0.

Changed in version 3.5.0: Supports Spark Connect.

Parameters
path : str, optional
    optional string for file-system backed data sources.
format : str, optional
    optional string for format of the data source. Defaults to ‘parquet’.
schema : pyspark.sql.types.StructType or str, optional
    optional pyspark.sql.types.StructType for the input schema or a DDL-formatted string (for example, col0 INT, col1 DOUBLE). See the sketch after this list for building the schema programmatically.
**options : dict
    all other string options
 
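As a minimal sketch beyond the reference text above, the schema can also be built programmatically as a StructType, and extra source options can be forwarded through **options as keyword arguments; the field names, the placeholder path, and the choice of the file-source option maxFilesPerTrigger here are illustrative assumptions, not part of the original page.

>>> from pyspark.sql.types import StructType, StructField, IntegerType, StringType
>>> # StructType equivalent to the DDL string "age INT, name STRING".
>>> schema = StructType([
...     StructField("age", IntegerType()),
...     StructField("name", StringType()),
... ])
>>> # path, format, and schema can be passed directly to load(); remaining
>>> # keyword arguments are forwarded to the source as options.
>>> df = spark.readStream.load(
...     path="/tmp/stream-input",  # placeholder directory, adjust as needed
...     format="json",
...     schema=schema,
...     maxFilesPerTrigger=1,
... )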
Notes

This API is evolving.

Examples

Load a data stream from a temporary JSON file.

>>> import tempfile
>>> import time
>>> with tempfile.TemporaryDirectory() as d:
...     # Write a temporary JSON file to read it.
...     spark.createDataFrame(
...         [(100, "Hyukjin Kwon"),], ["age", "name"]
...     ).write.mode("overwrite").format("json").save(d)
...
...     # Start a streaming query to read the JSON file.
...     q = spark.readStream.schema(
...         "age INT, name STRING"
...     ).format("json").load(d).writeStream.format("console").start()
...     time.sleep(3)
...     q.stop()
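As an additional hedged sketch not found in the original page, the built-in rate source needs neither a path nor a schema (it generates rows of (timestamp, value), one row per second by default), which makes it a convenient way to try load end to end without any files on disk:

>>> import time
>>> # No path or schema is required for the rate source.
>>> q = spark.readStream.format("rate").load().writeStream.format("console").start()
>>> time.sleep(3)
>>> q.stop()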