org.apache.spark.sql

DataFrameWriterV2

abstract class DataFrameWriterV2[T] extends CreateTableWriter[T]

Interface used to write an org.apache.spark.sql.api.Dataset to external storage using the v2 API.

Annotations
@Experimental()
Source
DataFrameWriterV2.scala
Since

3.0.0

Linear Supertypes
CreateTableWriter, WriteConfigMethods, AnyRef, Any

Instance Constructors

  1. new DataFrameWriterV2()

Abstract Value Members

  1. abstract def append(): Unit

    Append the contents of the data frame to the output table.

    If the output table does not exist, this operation will fail with org.apache.spark.sql.catalyst.analysis.NoSuchTableException. The data frame will be validated to ensure it is compatible with the existing table.

    Annotations
    @throws(classOf[NoSuchTableException])
    Exceptions thrown

    org.apache.spark.sql.catalyst.analysis.NoSuchTableException If the table does not exist
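
    The failure mode described above can be handled by the caller. The following is a minimal sketch, not part of the API: the helper name `tryAppend` is hypothetical, and it catches `AnalysisException` (the superclass of `NoSuchTableException`), so other analysis-time rejections such as an incompatible schema are reported the same way.

    ```scala
    import org.apache.spark.sql.{AnalysisException, DataFrame}

    // Hypothetical helper: append `df` to an existing table and return the
    // error message instead of throwing when the write is rejected.
    def tryAppend(df: DataFrame, table: String): Either[String, Unit] =
      try {
        df.writeTo(table).append()
        Right(())
      } catch {
        // NoSuchTableException is a subclass of AnalysisException.
        case e: AnalysisException => Left(e.getMessage)
      }
    ```

    Whether an append succeeds against an existing table also depends on the catalog that owns the table supporting v2 writes.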

  2. abstract def clusterBy(colName: String, colNames: String*): DataFrameWriterV2.this.type

    Clusters the output by the given columns on the storage. The rows with matching values in the specified clustering columns will be consolidated within the same group.

    For instance, if you cluster a dataset by date, the data sharing the same date will be stored together in a file. This arrangement improves query efficiency when you apply selective filters to these clustering columns, thanks to data skipping.

    Definition Classes
    DataFrameWriterV2 → CreateTableWriter
    Annotations
    @varargs()
  3. abstract def create(): Unit

    Create a new table from the contents of the data frame.

    The new table's schema, partition layout, properties, and other configuration will be based on the configuration set on this writer.

    If the output table exists, this operation will fail with org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException.

    Definition Classes
    CreateTableWriter
    Annotations
    @throws(classOf[TableAlreadyExistsException])
    Exceptions thrown

    org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException If the table already exists
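
    As an illustration of how the writer configuration feeds into `create()`, here is a minimal sketch. The helper name `createEventsTable`, the `day` column, and the `team` property key are assumptions made for the example; `writeTo` is the `Dataset` method that returns this writer.

    ```scala
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    // Hypothetical helper: create a new table from `df`, partitioned by an
    // assumed `day` column. Fails if the table already exists.
    def createEventsTable(df: DataFrame, table: String): Unit =
      df.writeTo(table)
        .using("parquet")                 // provider for the output data source
        .partitionedBy(col("day"))        // partition layout of the new table
        .tableProperty("team", "data-eng") // placeholder property key and value
        .create()
    ```

    Calling the helper a second time with the same table name is expected to fail with the exception listed above rather than overwrite the table.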

  4. abstract def createOrReplace(): Unit

    Create a new table or replace an existing table with the contents of the data frame.

    The output table's schema, partition layout, properties, and other configuration will be based on the contents of the data frame and the configuration set on this writer. If the table exists, its configuration and data will be replaced.

    Definition Classes
    CreateTableWriter
  5. abstract def option(key: String, value: String): DataFrameWriterV2.this.type

    Add a write option.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
  6. abstract def options(options: java.util.Map[String, String]): DataFrameWriterV2.this.type

    Add write options from a Java Map.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
  7. abstract def options(options: scala.collection.Map[String, String]): DataFrameWriterV2.this.type

    Add write options from a Scala Map.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
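
    The three ways of adding write options can be combined on one writer. This is a sketch under assumptions: the helper name `configuredWriter` is hypothetical and every option key is a placeholder, since the keys that are actually honored depend on the data source behind the table.

    ```scala
    import org.apache.spark.sql.DataFrame

    // Hypothetical helper: returns a writer with options set, without writing anything.
    def configuredWriter(df: DataFrame, table: String) = {
      val javaOptions = new java.util.HashMap[String, String]()
      javaOptions.put("example.java.key", "value") // placeholder key

      df.writeTo(table)
        .option("example.string.key", "value")        // single String option
        .options(Map("example.scala.key" -> "value")) // Scala Map overload
        .options(javaOptions)                         // Java Map overload
    }
    ```

    Nothing is written until a terminal operation such as `append()` or `create()` is called on the returned writer.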
  8. abstract def overwrite(condition: Column): Unit

    Overwrite rows matching the given filter condition with the contents of the data frame in the output table.

    If the output table does not exist, this operation will fail with org.apache.spark.sql.catalyst.analysis.NoSuchTableException. The data frame will be validated to ensure it is compatible with the existing table.

    Annotations
    @throws(classOf[NoSuchTableException])
    Exceptions thrown

    org.apache.spark.sql.catalyst.analysis.NoSuchTableException If the table does not exist
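
    A short sketch of a filter-based overwrite, assuming the target table has a `day` column. The helper name `replaceDay` is hypothetical, and whether a given filter is accepted depends on the table implementation.

    ```scala
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    // Hypothetical helper: replace the rows for one day with the contents of `df`.
    // Rows in the table that do not match the condition are left as they are.
    def replaceDay(df: DataFrame, table: String, day: String): Unit =
      df.writeTo(table).overwrite(col("day") === day)
    ```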

  9. abstract def overwritePartitions(): Unit

    Overwrite all partitions for which the data frame contains at least one row with the contents of the data frame in the output table.

    This operation is equivalent to Hive's INSERT OVERWRITE ... PARTITION, which replaces partitions dynamically depending on the contents of the data frame.

    If the output table does not exist, this operation will fail with org.apache.spark.sql.catalyst.analysis.NoSuchTableException. The data frame will be validated to ensure it is compatible with the existing table.

    Annotations
    @throws(classOf[NoSuchTableException])
    Exceptions thrown

    org.apache.spark.sql.catalyst.analysis.NoSuchTableException If the table does not exist
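
    For comparison with the filter-based overwrite, a dynamic partition overwrite needs no condition; the partitions to replace are derived from the rows in the data frame. A minimal sketch, with the hypothetical helper name `refreshPartitions`:

    ```scala
    import org.apache.spark.sql.DataFrame

    // Hypothetical helper: every partition that has at least one row in `df`
    // is replaced; partitions with no rows in `df` are left untouched.
    def refreshPartitions(df: DataFrame, table: String): Unit =
      df.writeTo(table).overwritePartitions()
    ```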

  10. abstract def partitionedBy(column: Column, columns: Column*): DataFrameWriterV2.this.type

    Partition the output table created by create, createOrReplace, or replace using the given columns or transforms.

    When specified, the table data will be stored by these values for efficient reads.

    For example, when a table is partitioned by day, it may be stored in a directory layout like:

    • table/day=2019-06-01/
    • table/day=2019-06-02/

    Partitioning is one of the most widely used techniques to optimize physical data layout. It provides a coarse-grained index for skipping unnecessary data reads when queries have predicates on the partitioned columns. In order for partitioning to work well, the number of distinct values in each column should typically be less than tens of thousands.

    Definition Classes
    DataFrameWriterV2 → CreateTableWriter
    Annotations
    @varargs()
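
    Besides plain columns, partition transforms can be passed. The sketch below uses the `days` and `bucket` functions from `org.apache.spark.sql.functions`; the helper name `partitionedWriter` and the `ts` and `id` columns are assumptions, and which transforms are supported depends on the catalog that creates the table.

    ```scala
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{bucket, col, days}

    // Hypothetical helper: returns a writer configured to partition by the day
    // of an assumed `ts` timestamp column and by 16 hash buckets of `id`.
    def partitionedWriter(df: DataFrame, table: String) =
      df.writeTo(table)
        .partitionedBy(days(col("ts")), bucket(16, col("id")))
    ```

    The partitioning only takes effect once `create()`, `createOrReplace()`, or `replace()` is called on the returned writer.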
  11. abstract def replace(): Unit

    Replace an existing table with the contents of the data frame.

    The existing table's schema, partition layout, properties, and other configuration will be replaced with the contents of the data frame and the configuration set on this writer.

    If the output table does not exist, this operation will fail with org.apache.spark.sql.catalyst.analysis.CannotReplaceMissingTableException.

    Definition Classes
    CreateTableWriter
    Annotations
    @throws(classOf[CannotReplaceMissingTableException])
    Exceptions thrown

    org.apache.spark.sql.catalyst.analysis.CannotReplaceMissingTableException If the table does not exist

  12. abstract def tableProperty(property: String, value: String): DataFrameWriterV2.this.type

    Add a table property.

    Definition Classes
    DataFrameWriterV2 → CreateTableWriter
  13. abstract def using(provider: String): DataFrameWriterV2.this.type

    Specifies a provider for the underlying output data source. Spark's default catalog supports "parquet", "json", etc.

    Definition Classes
    DataFrameWriterV2 → CreateTableWriter

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  8. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  9. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  10. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  11. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  12. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  13. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  14. def option(key: String, value: Double): DataFrameWriterV2.this.type

    Add a double output option.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
  15. def option(key: String, value: Long): DataFrameWriterV2.this.type

    Add a long output option.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
  16. def option(key: String, value: Boolean): DataFrameWriterV2.this.type

    Add a boolean output option.

    Definition Classes
    DataFrameWriterV2 → WriteConfigMethods
  17. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  20. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
