Load a streaming SparkDataFrame
read.stream.Rd
Returns the dataset in a data source as a streaming SparkDataFrame.
Arguments
- source
The name of external data source
- schema
The data schema, defined as a structType or a DDL-formatted string. It is required for file-based streaming data sources.
- ...
Additional named options specific to the external data source, for instance path for a file-based streaming data source, or timeZone to indicate the time zone used to parse timestamps in the JSON/CSV data sources or partition values. If timeZone isn't set, the default value, the session local time zone, is used.
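As a rough illustration of how schema and the named options combine, the sketch below reads CSV files as a stream with a DDL-formatted schema and a timeZone option. The directory csvDir and the column names ts and value are hypothetical, and the sketch assumes that SparkR is installed and that a local Spark session can be started.

```r
library(SparkR)
sparkR.session()

# Hypothetical input directory; an empty temporary directory is enough here
csvDir <- tempfile("csv_stream_")
dir.create(csvDir)

# A file-based source needs a schema, given here as a DDL-formatted string
# (the column names ts and value are made up for illustration)
ddlSchema <- "ts TIMESTAMP, value DOUBLE"

# path and timeZone are passed through ... as named options
df <- read.stream("csv", path = csvDir, schema = ddlSchema, timeZone = "UTC")

# Check that the result is a streaming SparkDataFrame and inspect its schema
isStreaming(df)
printSchema(df)
```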
Details
The data source is specified by source and a set of options (...). If source is not specified, the default data source configured by "spark.sql.sources.default" will be used.
Examples
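if (FALSE) {
sparkR.session()
# Read lines from a socket and write them out as text files
df <- read.stream("socket", host = "localhost", port = 9999)
q <- write.stream(df, "text", path = "/home/user/out", checkpointLocation = "/home/user/cp")
# File-based sources require a schema; jsonDir is a placeholder input directory
jsonDir <- "/home/user/json"
schema <- structType(structField("name", "string"), structField("info", "map<string,double>"))
df <- read.stream("json", path = jsonDir, schema = schema, maxFilesPerTrigger = 1)
# The same schema expressed as a DDL-formatted string
stringSchema <- "name STRING, info MAP<STRING, DOUBLE>"
df1 <- read.stream("json", path = jsonDir, schema = stringSchema, maxFilesPerTrigger = 1)
}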
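The fallback to the default data source described under Details can be sketched as follows. This is a minimal example under stated assumptions: SparkR is installed, a local session can be started, and "spark.sql.sources.default" still has the value Spark ships with (parquet). The directory parquetDir and the column names name and age are hypothetical.

```r
library(SparkR)
sparkR.session()

# Inspect the configured default data source
# (assumed to be "parquet" unless the session configuration changed it)
sparkR.conf("spark.sql.sources.default")

# Hypothetical input directory; an empty temporary directory is enough here
parquetDir <- tempfile("parquet_stream_")
dir.create(parquetDir)

# Schema for the file-based source (column names are made up for illustration)
schema <- structType(structField("name", "string"), structField("age", "integer"))

# source is omitted, so the configured default data source is used
df <- read.stream(path = parquetDir, schema = schema)
isStreaming(df)
```

Naming the source explicitly, as in read.stream("parquet", ...), keeps the call independent of the session configuration.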