trait RequiresDistributionAndOrdering extends Write
A write that requires a specific distribution and ordering of data.
- Annotations
- @Experimental()
- Source
- RequiresDistributionAndOrdering.java
- Since
- 3.2.0 
Inheritance
- RequiresDistributionAndOrdering
- Write
- AnyRef
- Any
Abstract Value Members
-   abstract  def requiredDistribution(): Distribution
Returns the distribution required by this write. Spark will distribute incoming records across partitions to satisfy the required distribution before passing the records to the data source table on write. Batch and micro-batch writes can request a particular data distribution. If a distribution is requested in the micro-batch context, incoming records in each micro batch will satisfy the required distribution (but not across micro batches). The continuous execution mode continuously processes streaming data and does not support distribution requirements. Implementations may return UnspecifiedDistribution if they don't require any specific distribution of data on write.
- returns
- the required distribution
 
-   abstract  def requiredOrdering(): Array[SortOrder]
Returns the ordering required by this write. Spark will order incoming records within partitions to satisfy the required ordering before passing those records to the data source table on write. Batch and micro-batch writes can request a particular data ordering. If an ordering is requested in the micro-batch context, incoming records in each micro batch will satisfy the required ordering (but not across micro batches). The continuous execution mode continuously processes streaming data and does not support ordering requirements. Implementations may return an empty array if they don't require any specific ordering of data on write. (A hedged sketch implementing both abstract members follows this entry.)
- returns
- the required ordering
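
As an illustration only (the class name and the "date" column are assumptions, not part of this API), a minimal Scala sketch of a write implementing both abstract members might look like this:

```scala
import org.apache.spark.sql.connector.distributions.{Distribution, Distributions}
import org.apache.spark.sql.connector.expressions.{Expression, Expressions, SortDirection, SortOrder}
import org.apache.spark.sql.connector.write.RequiresDistributionAndOrdering

// Hypothetical write that clusters rows by a "date" column and sorts them
// within each partition before they reach the data source.
class DateClusteredWrite extends RequiresDistributionAndOrdering {
  override def description(): String = "date-clustered write (example)"

  // Ask Spark to co-locate all rows that share the same "date" value.
  override def requiredDistribution(): Distribution =
    Distributions.clustered(Array[Expression](Expressions.identity("date")))

  // Ask Spark to sort rows by "date" ascending within each partition.
  override def requiredOrdering(): Array[SortOrder] =
    Array(Expressions.sort(Expressions.identity("date"), SortDirection.ASCENDING))
}
```

The inherited Write members (toBatch, toStreaming, and the optional members below) keep their defaults unless overridden.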
 
Concrete Value Members
-   final  def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any

-   final  def ##: Int
- Definition Classes
- AnyRef → Any

-   final  def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-    def advisoryPartitionSizeInBytes(): Long
Returns the advisory (not guaranteed) shuffle partition size in bytes for this write. Implementations may override this to indicate the preferable partition size in shuffles performed to satisfy the requested distribution. Note that Spark doesn't support setting the advisory partition size for UnspecifiedDistribution; the query will fail if the advisory partition size is set but the distribution is unspecified. Data sources may request either a particular number of partitions via #requiredNumPartitions() or a preferred partition size, not both. Data sources should be careful with large advisory sizes, as they will impact write parallelism and may degrade the overall job performance. Note that this value acts only as guidance, and Spark does not guarantee that the actual and advisory shuffle partition sizes will match. Ignored if adaptive execution is disabled.
- returns
- the advisory partition size; any value less than 1 means no preference.
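
For illustration, inside the hypothetical DateClusteredWrite sketch shown earlier, a preference for roughly 128 MiB shuffle partitions could look like this (the size is an arbitrary assumption, not a recommendation):

```scala
// Advisory only: honored under adaptive query execution, never guaranteed.
// Must not be combined with requiredNumPartitions().
override def advisoryPartitionSizeInBytes(): Long = 128L * 1024 * 1024
```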
 
-   final  def asInstanceOf[T0]: T0
- Definition Classes
- Any

-    def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
 
-    def description(): String
Returns the description associated with this write.
- Definition Classes
- Write
 
-    def distributionStrictlyRequired(): Boolean
Returns whether the distribution required by this write is strictly required or best effort only. If true, Spark will strictly distribute incoming records across partitions to satisfy the required distribution before passing the records to the data source table on write. Otherwise, Spark may apply certain optimizations to speed up the query but break the distribution requirement.
- returns
- true if the distribution required by this write is strictly required; false otherwise.
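
Continuing the hypothetical sketch above, a write could mark its distribution as best-effort so Spark may trade strictness for speed:

```scala
// Best-effort: Spark may relax or skip the shuffle to speed up the query,
// at the cost of possibly violating requiredDistribution().
override def distributionStrictlyRequired(): Boolean = false
```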
 
-   final  def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef

-    def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any

-   final  def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()

-    def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()

-   final  def isInstanceOf[T0]: Boolean
- Definition Classes
- Any

-   final  def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
 
-   final  def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()

-   final  def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
 
-    def reportDriverMetrics(): Array[CustomTaskMetric]
Returns an array of custom metrics which are collected with values at the driver side only. Note that these metrics must be included in the supported custom metrics reported by supportedCustomMetrics. (A combined sketch of both metric members follows supportedCustomMetrics below.)
- Definition Classes
- Write
 
-    def requiredNumPartitions(): Int
Returns the number of partitions required by this write. Implementations may override this to require a specific number of input partitions. Note that Spark doesn't support setting the number of partitions for UnspecifiedDistribution; the query will fail if the number of partitions is provided but the distribution is unspecified. Data sources may request either a particular number of partitions or a preferred partition size via #advisoryPartitionSizeInBytes, not both.
- returns
- the required number of partitions; any value less than 1 means no requirement.
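
In the hypothetical sketch above, a fixed fan-in could be requested like this (10 is an arbitrary assumption):

```scala
// Require exactly 10 input partitions on write; mutually exclusive with
// advisoryPartitionSizeInBytes(). A value below 1 means "no requirement".
override def requiredNumPartitions(): Int = 10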
 
-    def supportedCustomMetrics(): Array[CustomMetric]
Returns an array of supported custom metrics with name and description. By default, it returns an empty array.
- Definition Classes
- Write
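
A sketch of how the two metric members could cooperate, assuming a hypothetical "bytesWritten" counter (CustomSumMetric is a built-in convenience base class that sums per-task values):

```scala
import org.apache.spark.sql.connector.metric.{CustomMetric, CustomSumMetric, CustomTaskMetric}

// Hypothetical metric that sums per-task "bytesWritten" values.
class BytesWrittenMetric extends CustomSumMetric {
  override def name(): String = "bytesWritten"
  override def description(): String = "total bytes written by this write"
}

// Declare the metric so Spark knows how to aggregate and display it.
override def supportedCustomMetrics(): Array[CustomMetric] =
  Array(new BytesWrittenMetric)

// Driver-side values must use names declared in supportedCustomMetrics().
override def reportDriverMetrics(): Array[CustomTaskMetric] =
  Array(new CustomTaskMetric {
    override def name(): String = "bytesWritten"
    override def value(): Long = 0L // e.g. a counter maintained on the driver
  })
```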
 
-   final  def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
 
-    def toBatch(): BatchWrite
Returns a BatchWrite to write data to a batch source. By default this method throws an exception; data sources must override it to provide an implementation if the Table that creates this write reports TableCapability#BATCH_WRITE support in its Table#capabilities().
- Definition Classes
- Write
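
A hedged sketch of the batch path; ExampleWriterFactory is hypothetical and the commit/abort bodies are placeholders:

```scala
import org.apache.spark.sql.connector.write.{BatchWrite, DataWriterFactory, PhysicalWriteInfo, WriterCommitMessage}

override def toBatch(): BatchWrite = new BatchWrite {
  // Hand each task a serializable factory that opens per-partition writers.
  override def createBatchWriterFactory(info: PhysicalWriteInfo): DataWriterFactory =
    new ExampleWriterFactory() // hypothetical factory, not part of this API

  override def commit(messages: Array[WriterCommitMessage]): Unit = ()
  override def abort(messages: Array[WriterCommitMessage]): Unit = ()
}
```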
 
-    def toStreaming(): StreamingWrite
Returns a StreamingWrite to write data to a streaming source. By default this method throws an exception; data sources must override it to provide an implementation if the Table that creates this write reports TableCapability#STREAMING_WRITE support in its Table#capabilities().
- Definition Classes
- Write
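
The streaming path follows the same pattern per epoch; ExampleStreamingWriterFactory is hypothetical:

```scala
import org.apache.spark.sql.connector.write.streaming.{StreamingDataWriterFactory, StreamingWrite}
import org.apache.spark.sql.connector.write.{PhysicalWriteInfo, WriterCommitMessage}

override def toStreaming(): StreamingWrite = new StreamingWrite {
  override def createStreamingWriterFactory(info: PhysicalWriteInfo): StreamingDataWriterFactory =
    new ExampleStreamingWriterFactory() // hypothetical, not part of this API

  // Each micro-batch (epoch) commits or aborts as a unit.
  override def commit(epochId: Long, messages: Array[WriterCommitMessage]): Unit = ()
  override def abort(epochId: Long, messages: Array[WriterCommitMessage]): Unit = ()
}
```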
 
-    def toString(): String
- Definition Classes
- AnyRef → Any

-   final  def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])

-   final  def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()

-   final  def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
 
Deprecated Value Members
-    def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
- (Since version 9)