pyspark.sql.DataFrameWriter

class pyspark.sql.DataFrameWriter(df: DataFrame)
Interface used to write a `DataFrame` to external storage systems (e.g. file systems, key-value stores, etc.). Use `DataFrame.write` to access this.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Methods

| Method | Description |
|---|---|
| `bucketBy(numBuckets, col, *cols)` | Buckets the output by the given columns. |
| `csv(path[, mode, compression, sep, quote, …])` | Saves the content of the `DataFrame` in CSV format at the specified path. |
| `format(source)` | Specifies the underlying output data source. |
| `insertInto(tableName[, overwrite])` | Inserts the content of the `DataFrame` into the specified table. |
| `jdbc(url, table[, mode, properties])` | Saves the content of the `DataFrame` to an external database table via JDBC. |
| `json(path[, mode, compression, dateFormat, …])` | Saves the content of the `DataFrame` in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path. |
| `mode(saveMode)` | Specifies the behavior when data or table already exists. |
| `option(key, value)` | Adds an output option for the underlying data source. |
| `options(**options)` | Adds output options for the underlying data source. |
| `orc(path[, mode, partitionBy, compression])` | Saves the content of the `DataFrame` in ORC format at the specified path. |
| `parquet(path[, mode, partitionBy, compression])` | Saves the content of the `DataFrame` in Parquet format at the specified path. |
| `partitionBy(*cols)` | Partitions the output by the given columns on the file system. |
| `save([path, format, mode, partitionBy])` | Saves the contents of the `DataFrame` to a data source. |
| `saveAsTable(name[, format, mode, partitionBy])` | Saves the content of the `DataFrame` as the specified table. |
| `sortBy(col, *cols)` | Sorts the output in each bucket by the given columns on the file system. |
| `text(path[, compression, lineSep])` | Saves the content of the `DataFrame` in a text file at the specified path. |
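Examples

A minimal sketch showing how these methods compose. The output paths, table name, and column names below are illustrative assumptions, not part of the API; the method calls themselves (`format`, `mode`, `option`, `partitionBy`, `csv`, `bucketBy`, `sortBy`, `saveAsTable`) are the `DataFrameWriter` interface listed above.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("writer-demo").getOrCreate()

# Toy data; the column names ("id", "year", "value") are illustrative only.
df = spark.createDataFrame(
    [(1, 2023, "a"), (2, 2024, "b")], ["id", "year", "value"]
)

# Generic path: pick a format, set the save mode, add options, then save.
(df.write
    .format("parquet")                 # underlying output data source
    .mode("overwrite")                 # behavior if the path already exists
    .option("compression", "snappy")   # output option for the data source
    .partitionBy("year")               # one directory per distinct "year"
    .save("/tmp/writer_demo/parquet")) # assumed path, for illustration

# Format-specific shorthands cover the same ground in one call.
df.write.csv("/tmp/writer_demo/csv", mode="overwrite", header=True)

# bucketBy/sortBy apply to table output, so pair them with saveAsTable.
(df.write
    .bucketBy(4, "id")
    .sortBy("id")
    .mode("overwrite")
    .saveAsTable("writer_demo_bucketed"))  # assumed table name

spark.stop()
```

Note the split in the API: `partitionBy` works with both path-based `save` and `saveAsTable`, while `bucketBy` and `sortBy` require a table sink such as `saveAsTable`.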