write.parquet {SparkR}    R Documentation

Save the contents of a SparkDataFrame as a Parquet file, preserving the schema

Description

Save the contents of a SparkDataFrame as a Parquet file, preserving the schema. Files written out with this method can be read back in as a SparkDataFrame using read.parquet().
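
The round trip described above can be sketched as follows. This is a minimal illustration, assuming a local Spark session and using R's built-in `faithful` dataset as input; the output location `out` is a hypothetical temporary directory, not a path prescribed by SparkR.

```r
library(SparkR)
sparkR.session()

# Build a small SparkDataFrame from a built-in R data.frame.
df <- createDataFrame(faithful)

# Hypothetical output location; Spark writes a directory of part files here.
out <- file.path(tempdir(), "faithful-parquet")

# Write the data, then read it back as a new SparkDataFrame.
write.parquet(df, out, mode = "overwrite")
df2 <- read.parquet(out)

# The column names and types are stored with the data and restored on read.
printSchema(df2)
```

Because Parquet stores the schema alongside the data, `read.parquet()` needs no schema argument here.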

Usage

write.parquet(x, path, ...)

## S4 method for signature 'SparkDataFrame,character'
write.parquet(x, path, mode = "error", ...)

Arguments

x

A SparkDataFrame

path

The directory where the file is saved

...

additional argument(s) passed to the method. The Parquet-specific options for writing Parquet files are listed under "Data Source Option" (https://spark.apache.org/docs/latest/sql-data-sources-parquet.html#data-source-option) in the documentation for the Spark version you use.

mode

the save mode: one of 'append', 'overwrite', 'error', 'errorifexists', or 'ignore'. The default is 'error'.
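
The effect of the mode argument can be sketched as below. This is an illustrative example rather than part of the official documentation: it assumes a local Spark session, uses the built-in `faithful` dataset, and writes to a hypothetical temporary directory `out`. The message returned by the error handler is made up for the example.

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)
out <- file.path(tempdir(), "faithful-modes")  # hypothetical output directory

write.parquet(df, out, mode = "overwrite")  # replace whatever is at `out`
write.parquet(df, out, mode = "append")     # add a second copy of the rows
write.parquet(df, out, mode = "ignore")     # skipped, because `out` already exists

# The default mode = "error" refuses to write to an existing path.
res <- tryCatch(
  write.parquet(df, out),
  error = function(e) "write refused: path already exists"
)

# Count the rows now stored at `out` (one overwrite plus one append).
n <- nrow(read.parquet(out))
```

Passing mode = "overwrite" explicitly is a common choice for scripts that are rerun, since the default stops at the first rerun.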

Note

write.parquet since 1.6.0

See Also

Other SparkDataFrame functions: SparkDataFrame-class, agg(), alias(), arrange(), as.data.frame(), attach,SparkDataFrame-method, broadcast(), cache(), checkpoint(), coalesce(), collect(), colnames(), coltypes(), createOrReplaceTempView(), crossJoin(), cube(), dapplyCollect(), dapply(), describe(), dim(), distinct(), dropDuplicates(), dropna(), drop(), dtypes(), exceptAll(), except(), explain(), filter(), first(), gapplyCollect(), gapply(), getNumPartitions(), group_by(), head(), hint(), histogram(), insertInto(), intersectAll(), intersect(), isLocal(), isStreaming(), join(), limit(), localCheckpoint(), merge(), mutate(), ncol(), nrow(), persist(), printSchema(), randomSplit(), rbind(), rename(), repartitionByRange(), repartition(), rollup(), sample(), saveAsTable(), schema(), selectExpr(), select(), showDF(), show(), storageLevel(), str(), subset(), summary(), take(), toJSON(), unionAll(), unionByName(), union(), unpersist(), withColumn(), withWatermark(), with(), write.df(), write.jdbc(), write.json(), write.orc(), write.stream(), write.text()

Examples

## Not run: 
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
write.parquet(df, "/tmp/sparkr-tmp1/")
## End(Not run)
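
Parquet-specific write options are passed through the ... argument as named arguments. The sketch below assumes a local Spark session and that the `compression` option with the value "gzip" is supported by the Spark version in use; the output directory `out` is hypothetical.

```r
library(SparkR)
sparkR.session()

df <- createDataFrame(faithful)
out <- file.path(tempdir(), "faithful-gzip")  # hypothetical output directory

# `compression` is forwarded to the Parquet writer as a data source option.
write.parquet(df, out, mode = "overwrite", compression = "gzip")

# In local mode the part files can be listed directly from R.
files <- list.files(out, pattern = "\\.parquet$")
```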

[Package SparkR version 3.2.0 Index]