org.apache.spark.sql

DataFrameWriter

final class DataFrameWriter extends AnyRef

:: Experimental :: Interface used to write a DataFrame to external storage systems (e.g. file systems, key-value stores, etc.). Use DataFrame.write to access this.

Annotations
@Experimental()
Since

1.4.0

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. def format(source: String): DataFrameWriter

    Specifies the underlying output data source. Built-in options include "parquet", "json", etc.

    Since

    1.4.0

  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. def insertInto(tableName: String): Unit

    Inserts the content of the DataFrame into the specified table. The schema of the DataFrame must be the same as the schema of the table.

    Because it inserts data into an existing table, format and options are ignored.

    Since

    1.4.0

  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. def jdbc(url: String, table: String, connectionProperties: Properties): Unit

    Saves the content of the DataFrame to an external database table via JDBC. If the table already exists in the external database, the behavior of this function depends on the save mode, specified by the mode function (the default is to throw an exception).

    Don't create too many partitions in parallel on a large cluster; otherwise Spark might crash your external database systems.

    url

    JDBC database url of the form jdbc:subprotocol:subname

    table

    Name of the table in the external database.

    connectionProperties

    JDBC database connection arguments, a list of arbitrary string tag/value pairs. Normally at least a "user" and "password" property should be included.
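    The partition warning above can be made concrete with a minimal sketch. It assumes a DataFrame built elsewhere; the object name JdbcWriteSketch, its helper functions, and the maxConnections parameter are hypothetical names introduced for illustration, not part of this API.

```scala
import java.util.Properties
import org.apache.spark.sql.{DataFrame, SaveMode}

object JdbcWriteSketch {
  // Builds the connection arguments; "user" and "password" are the usual minimum.
  def connectionProps(user: String, password: String): Properties = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    props
  }

  // Appends df to a table, first reducing the number of partitions so that
  // at most maxConnections tasks write to the database at the same time.
  // maxConnections is a hypothetical tuning parameter, not a Spark setting.
  def appendTo(df: DataFrame, url: String, table: String,
               props: Properties, maxConnections: Int): Unit =
    df.coalesce(maxConnections)
      .write
      .mode(SaveMode.Append)
      .jdbc(url, table, props)
}
```

    A call might look like JdbcWriteSketch.appendTo(df, url, "events", props, 8), where url follows the jdbc:subprotocol:subname form described above and the table name is a placeholder.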

  17. def json(path: String): Unit

    Saves the content of the DataFrame in JSON format at the specified path. This is equivalent to:

    format("json").save(path)
    Since

    1.4.0

  18. def mode(saveMode: String): DataFrameWriter

    Specifies the behavior when data or table already exists. Options include:

    • overwrite: overwrite the existing data.
    • append: append the data.
    • ignore: ignore the operation (i.e. no-op).
    • error: default option, throw an exception at runtime.
    Since

    1.4.0

  19. def mode(saveMode: SaveMode): DataFrameWriter

    Specifies the behavior when data or table already exists. Options include:

    • SaveMode.Overwrite: overwrite the existing data.
    • SaveMode.Append: append the data.
    • SaveMode.Ignore: ignore the operation (i.e. no-op).
    • SaveMode.ErrorIfExists: default option, throw an exception at runtime.
    Since

    1.4.0
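    As a rough illustration of the two mode overloads, the sketch below pairs each string name with the SaveMode constant listed above and overwrites existing output both ways. The object SaveModeSketch and its members are hypothetical names, and the DataFrame is assumed to exist already.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

object SaveModeSketch {
  // String names accepted by mode(String), paired with the SaveMode
  // constants from the two lists above.
  val byName: Map[String, SaveMode] = Map(
    "overwrite" -> SaveMode.Overwrite,
    "append"    -> SaveMode.Append,
    "ignore"    -> SaveMode.Ignore,
    "error"     -> SaveMode.ErrorIfExists
  )

  // Replaces whatever already exists at path.
  def overwrite(df: DataFrame, path: String): Unit =
    df.write.mode(SaveMode.Overwrite).parquet(path)

  // Same behavior, selecting the mode by its string name.
  def overwriteByName(df: DataFrame, path: String): Unit =
    df.write.mode("overwrite").parquet(path)
}
```

    The SaveMode overload is checked at compile time, so a misspelled mode cannot reach runtime; the string overload is convenient when the mode comes from configuration.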

  20. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  21. final def notify(): Unit

    Definition Classes
    AnyRef
  22. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  23. def option(key: String, value: String): DataFrameWriter

    Adds an output option for the underlying data source.

    Since

    1.4.0

  24. def options(options: java.util.Map[String, String]): DataFrameWriter

    Adds output options for the underlying data source.

    Since

    1.4.0

  25. def options(options: scala.collection.Map[String, String]): DataFrameWriter

    (Scala-specific) Adds output options for the underlying data source.

    Since

    1.4.0
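    A minimal sketch of chaining option and options before a save. The option keys and values shown are placeholders, since the keys a writer understands depend on the chosen data source; WriterOptionsSketch and its members are hypothetical names.

```scala
import org.apache.spark.sql.DataFrame

object WriterOptionsSketch {
  // Hypothetical option keys; real keys depend on the chosen data source.
  val extraOptions: Map[String, String] = Map(
    "exampleKeyA" -> "valueA",
    "exampleKeyB" -> "valueB"
  )

  // Configures the writer step by step, then saves to path.
  def writeWithOptions(df: DataFrame, source: String, path: String): Unit =
    df.write
      .format(source)
      .option("exampleKeyC", "valueC") // one option at a time
      .options(extraOptions)           // several options at once
      .save(path)
}
```

    Each call returns the DataFrameWriter, which is why the calls can be chained and why the order of option and options does not matter for distinct keys.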

  26. def orc(path: String): Unit

    Saves the content of the DataFrame in ORC format at the specified path. This is equivalent to:

    format("orc").save(path)
    Since

    1.5.0

    Note

    Currently, this method can only be used together with HiveContext.

  27. def parquet(path: String): Unit

    Saves the content of the DataFrame in Parquet format at the specified path. This is equivalent to:

    format("parquet").save(path)
    Since

    1.4.0

  28. def partitionBy(colNames: String*): DataFrameWriter

    Partitions the output by the given columns on the file system. If specified, the output is laid out on the file system similar to Hive's partitioning scheme.

    This is only applicable for Parquet at the moment.

    Annotations
    @varargs()
    Since

    1.4.0
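    To make the Hive-style layout concrete, the sketch below pairs a partitioned Parquet write with a small helper that spells out the directory expected for one combination of partition values. The column names "year" and "month", the object PartitionBySketch, and the helper partitionDir are all hypothetical and only illustrate the naming scheme.

```scala
import org.apache.spark.sql.DataFrame

object PartitionBySketch {
  // Hive-style directory for one combination of partition values:
  // each partition column contributes a "column=value" path segment.
  def partitionDir(base: String, values: Seq[(String, String)]): String =
    (base +: values.map { case (col, v) => s"$col=$v" }).mkString("/")

  // Writes df as Parquet, one directory per distinct (year, month) pair.
  // "year" and "month" are hypothetical columns assumed to exist in df.
  def writeByYearMonth(df: DataFrame, base: String): Unit =
    df.write.partitionBy("year", "month").parquet(base)
}
```

    For example, partitionDir("/data/events", Seq("year" -> "2015", "month" -> "7")) yields "/data/events/year=2015/month=7", which is the kind of directory the partitioned write is expected to create under the base path.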

  29. def save(): Unit

    Saves the content of the DataFrame using the data source, save mode, and options configured on this writer.

    Since

    1.4.0

  30. def save(path: String): Unit

    Saves the content of the DataFrame at the specified path.

    Since

    1.4.0
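    The two save variants can be sketched side by side. This assumes a file-based data source that reads its output location from a "path" option; SaveSketch and its functions are hypothetical names used only for illustration.

```scala
import org.apache.spark.sql.DataFrame

object SaveSketch {
  // Explicit source plus path; the same call the json(path) shorthand stands for.
  def saveAsJson(df: DataFrame, path: String): Unit =
    df.write.format("json").save(path)

  // No-argument save(): the location is supplied as a "path" option instead.
  // Assumes the chosen source takes its output location from that option.
  def saveWithPathOption(df: DataFrame, source: String, path: String): Unit =
    df.write.format(source).option("path", path).save()
}
```

    The no-argument form is mainly useful when every setting, including the location, arrives through options, for example from a configuration map.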

  31. def saveAsTable(tableName: String): Unit

    Saves the content of the DataFrame as the specified table.

    If the table already exists, the behavior of this function depends on the save mode, specified by the mode function (the default is to throw an exception). When the mode is Overwrite, the schema of the DataFrame does not need to match that of the existing table. When the mode is Append, the schema of the DataFrame must match that of the existing table, and format and options are ignored.

    When the DataFrame is created from a non-partitioned HadoopFsRelation with a single input path, and the data source provider can be mapped to an existing Hive builtin SerDe (i.e. ORC and Parquet), the table is persisted in a Hive compatible format, which means other systems like Hive will be able to read this table. Otherwise, the table is persisted in a Spark SQL specific format.

    Since

    1.4.0
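    The Overwrite and Append cases described above might be used as follows. This is a sketch that assumes the DataFrame comes from a context able to persist tables (for example a HiveContext); TableWriteSketch and its function names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

object TableWriteSketch {
  // Creates the table, or replaces it (schema included) if it already exists.
  def createOrReplace(df: DataFrame, tableName: String): Unit =
    df.write.format("parquet").mode(SaveMode.Overwrite).saveAsTable(tableName)

  // Adds rows to an existing table; df must match the table's schema, and
  // any format or options set on the writer are ignored in this mode.
  def appendRows(df: DataFrame, tableName: String): Unit =
    df.write.mode(SaveMode.Append).saveAsTable(tableName)
}
```

    Without an explicit mode, a second call against the same table name would hit the default behavior and throw, so callers that rerun jobs usually pick one of the two modes above deliberately.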

  32. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  33. def toString(): String

    Definition Classes
    AnyRef → Any
  34. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
