pyspark.sql.DataFrameWriter.mode

DataFrameWriter.mode(saveMode: Optional[str]) → pyspark.sql.readwriter.DataFrameWriter

Specifies the behavior when data or a table already exists at the destination.

Options include:

  • append: Append contents of this DataFrame to existing data.

  • overwrite: Overwrite existing data.

  • error or errorifexists: Throw an exception if data already exists.

  • ignore: Silently ignore this operation if data already exists.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.
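
The same save modes apply to table writes made through DataFrameWriter.saveAsTable. The sketch below is illustrative only; it assumes an active SparkSession bound to spark, and the table name people is hypothetical.

>>> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
>>> df.write.mode("overwrite").saveAsTable("people")  # replace the table if it exists
>>> df.write.mode("append").saveAsTable("people")     # add rows to the existing table
>>> spark.table("people").count()
2
>>> _ = spark.sql("DROP TABLE people")                # drop the hypothetical table again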

Examples

Raise an error when writing to an existing path.

>>> import tempfile
>>> with tempfile.TemporaryDirectory() as d:
...     spark.createDataFrame(
...         [{"age": 80, "name": "Xinrong Meng"}]
...     ).write.mode("error").format("parquet").save(d) 
Traceback (most recent call last):
    ...
...AnalysisException: ...

Write to a Parquet path with various save modes, and read the result back.

>>> with tempfile.TemporaryDirectory() as d:
...     # Overwrite the path with a new Parquet file
...     spark.createDataFrame(
...         [{"age": 100, "name": "Hyukjin Kwon"}]
...     ).write.mode("overwrite").format("parquet").save(d)
...
...     # Append another DataFrame into the Parquet file
...     spark.createDataFrame(
...         [{"age": 120, "name": "Takuya Ueshin"}]
...     ).write.mode("append").format("parquet").save(d)
...
...     # Try to write another DataFrame; "ignore" skips it because data already exists
...     spark.createDataFrame(
...         [{"age": 140, "name": "Haejoon Lee"}]
...     ).write.mode("ignore").format("parquet").save(d)
...
...     # Read the Parquet file as a DataFrame.
...     spark.read.parquet(d).show()
+---+-------------+
|age|         name|
+---+-------------+
|120|Takuya Ueshin|
|100| Hyukjin Kwon|
+---+-------------+
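
The mode can also be passed as a keyword argument to path-based writers such as DataFrameWriter.save and DataFrameWriter.parquet, which has the same effect as calling mode() before the write. A minimal sketch, assuming an active SparkSession bound to spark:

>>> import tempfile
>>> with tempfile.TemporaryDirectory() as d:
...     # Equivalent to .write.mode("overwrite").format("parquet").save(d)
...     spark.createDataFrame(
...         [{"age": 100, "name": "Hyukjin Kwon"}]
...     ).write.parquet(d, mode="overwrite")
...     spark.read.parquet(d).count()
1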