pyspark.sql.DataFrameWriter.saveAsTable
DataFrameWriter.saveAsTable(name, format=None, mode=None, partitionBy=None, **options)
Saves the content of the DataFrame as the specified table.

If the table already exists, the behavior of this function depends on the save mode, specified by the mode argument (default: throw an exception). When mode is overwrite, the schema of the DataFrame does not need to be the same as that of the existing table.

- append: Append contents of this DataFrame to existing data.
- overwrite: Overwrite existing data.
- error or errorifexists: Throw an exception if data already exists.
- ignore: Silently ignore this operation if data already exists.
New in version 1.4.0.
Changed in version 3.4.0: Supports Spark Connect.
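
A minimal, non-authoritative sketch of the save modes listed above; the table name tblB is only an illustrative placeholder:

>>> df = spark.createDataFrame([(1, "a")], schema=["id", "value"])
>>> df.write.saveAsTable("tblB")                    # creates tblB
>>> df.write.mode("ignore").saveAsTable("tblB")     # tblB exists: silently does nothing
>>> df.write.mode("overwrite").saveAsTable("tblB")  # replaces the existing data
>>> # df.write.saveAsTable("tblB")                  # default mode (error) would raise here
>>> _ = spark.sql("DROP TABLE tblB")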
Parameters
- name : str
  the table name
- format : str, optional
  the format used to save
- mode : str, optional
  one of append, overwrite, error, errorifexists, ignore (default: error)
- partitionBy : str or list
  names of partitioning columns
- **options : dict
  all other string options
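
As a hedged illustration of how format, mode, and partitionBy can be combined (tblP and the country column are placeholders made up for this sketch):

>>> spark.createDataFrame(
...     [(100, "US", "Hyukjin Kwon"), (140, "KR", "Haejoon Lee")],
...     schema=["age", "country", "name"]
... ).write.saveAsTable("tblP", format="parquet", mode="overwrite", partitionBy="country")
>>> spark.read.table("tblP").count()
2
>>> _ = spark.sql("DROP TABLE tblP")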
Notes
When mode is append, if there is an existing table, we will use the format and options of the existing table. The column order in the schema of the DataFrame does not need to be the same as that of the existing table. Unlike DataFrameWriter.insertInto(), DataFrameWriter.saveAsTable() will use the column names to find the correct column positions.

Examples
Create a table from a DataFrame, and read it back.
>>> _ = spark.sql("DROP TABLE IF EXISTS tblA")
>>> spark.createDataFrame([
...     (100, "Hyukjin Kwon"), (120, "Hyukjin Kwon"), (140, "Haejoon Lee")],
...     schema=["age", "name"]
... ).write.saveAsTable("tblA")
>>> spark.read.table("tblA").sort("age").show()
+---+------------+
|age|        name|
+---+------------+
|100|Hyukjin Kwon|
|120|Hyukjin Kwon|
|140| Haejoon Lee|
+---+------------+
>>> _ = spark.sql("DROP TABLE tblA")
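
The note above about column matching can be sketched as follows; tblC is an illustrative placeholder. In append mode the second DataFrame's columns are matched to the table by name, even though their order differs from the table's schema:

>>> spark.createDataFrame(
...     [(100, "Hyukjin Kwon")], schema=["age", "name"]
... ).write.saveAsTable("tblC")
>>> spark.createDataFrame(
...     [("Haejoon Lee", 140)], schema=["name", "age"]
... ).write.mode("append").saveAsTable("tblC")
>>> spark.read.table("tblC").sort("age").show()
+---+------------+
|age|        name|
+---+------------+
|100|Hyukjin Kwon|
|140| Haejoon Lee|
+---+------------+
>>> _ = spark.sql("DROP TABLE tblC")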