trait Scan extends AnyRef
A logical representation of a data source scan. This interface is used to provide logical information, like what the actual read schema is.
This logical representation is shared between batch scan, micro-batch streaming scan and
continuous streaming scan. Data sources must implement the corresponding methods in this
interface, to match what the table promises to support. For example, #toBatch() must be
implemented, if the Table that creates this Scan returns TableCapability#BATCH_READ
support in its Table#capabilities().
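The contract above can be illustrated with a minimal batch-only scan. This is a sketch, not code from Spark: `RangePartition`, `RangeReaderFactory` and `RangeScan` are hypothetical names, and it assumes the Table that creates the scan reports TableCapability#BATCH_READ. Only `readSchema()`, `description()` and `toBatch()` are overridden; the streaming methods keep their throwing defaults.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.read.{Batch, InputPartition, PartitionReader, PartitionReaderFactory, Scan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Hypothetical partition covering the half-open range [start, end).
case class RangePartition(start: Int, end: Int) extends InputPartition

// Hypothetical factory that emits one single-column row per integer in the range.
class RangeReaderFactory extends PartitionReaderFactory {
  override def createReader(partition: InputPartition): PartitionReader[InternalRow] = {
    val p = partition.asInstanceOf[RangePartition]
    new PartitionReader[InternalRow] {
      private var current = p.start - 1
      override def next(): Boolean = { current += 1; current < p.end }
      override def get(): InternalRow = InternalRow(current)
      override def close(): Unit = ()
    }
  }
}

// Hypothetical scan: the logical Scan and its physical Batch are the same object.
class RangeScan(numRows: Int) extends Scan with Batch {
  // The schema actually produced by this scan.
  override def readSchema(): StructType =
    StructType(Seq(StructField("value", IntegerType, nullable = false)))

  // A meaningful description instead of the default class name.
  override def description(): String = s"RangeScan(numRows=$numRows)"

  // Must be overridden because the table is assumed to report BATCH_READ.
  override def toBatch(): Batch = this

  override def planInputPartitions(): Array[InputPartition] =
    Array(RangePartition(0, numRows))

  override def createReaderFactory(): PartitionReaderFactory = new RangeReaderFactory
}
```

Letting one class implement both `Scan` and `Batch` keeps the sketch short; a larger data source might prefer separate classes for the logical and physical representations.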
- Annotations
- @Evolving()
- Source
- Scan.java
- Since
3.0.0
Abstract Value Members
- abstract def readSchema(): StructType
Returns the actual schema of this data source scan, which may be different from the physical schema of the underlying storage, as column pruning or other optimizations may happen.
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def columnarSupportMode(): ColumnarSupportMode
Subclasses can implement this method to indicate if the support for columnar data should be determined by each partition or is set as a default for the whole scan.
- Since
3.5.0
- def description(): String
A description string of this scan, which may include information such as which filters are configured for this scan and the values of important options like path. The description doesn't need to include #readSchema(), as Spark already knows it.
By default this returns the class name of the implementation. Please override it to provide a meaningful description.
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def reportDriverMetrics(): Array[CustomTaskMetric]
Returns an array of custom metrics which are collected with values at the driver side only. Note that these metrics must be included in the supported custom metrics reported by supportedCustomMetrics.
- Since
3.4.0
- def supportedCustomMetrics(): Array[CustomMetric]
Returns an array of supported custom metrics with name and description. By default it returns an empty array.
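The relationship between `supportedCustomMetrics` and `reportDriverMetrics` might look like the following sketch. `FilesListedMetric`, `MetricsScan` and the metric name `filesListed` are hypothetical; the sketch assumes the scan already knows, on the driver, how many files it listed. The driver-side metric reuses the name declared in the supported metrics, as the note above requires.

```scala
import org.apache.spark.sql.connector.metric.{CustomMetric, CustomSumMetric, CustomTaskMetric}
import org.apache.spark.sql.connector.read.Scan
import org.apache.spark.sql.types.StructType

object FilesListedMetric {
  // Shared name so the declared and reported metrics cannot drift apart.
  val Name = "filesListed"
}

// Hypothetical metric declaration; CustomSumMetric supplies the sum aggregation.
class FilesListedMetric extends CustomSumMetric {
  override def name(): String = FilesListedMetric.Name
  override def description(): String = "number of files listed during planning"
}

// Hypothetical scan that reports a value known only on the driver.
class MetricsScan(schema: StructType, filesListed: Long) extends Scan {
  override def readSchema(): StructType = schema

  // Declares every metric this scan may report.
  override def supportedCustomMetrics(): Array[CustomMetric] =
    Array(new FilesListedMetric)

  // Reports the driver-side value under a name declared above.
  override def reportDriverMetrics(): Array[CustomTaskMetric] = Array(
    new CustomTaskMetric {
      override def name(): String = FilesListedMetric.Name
      override def value(): Long = filesListed
    })
}
```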
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toBatch(): Batch
Returns the physical representation of this scan for batch query. By default this method throws an exception; data sources must override this method to provide an implementation if the Table that creates this scan returns TableCapability#BATCH_READ support in its Table#capabilities().
If the scan supports runtime filtering and implements SupportsRuntimeFiltering, this method may be called multiple times. Therefore, implementations can cache some state to avoid planning the job twice.
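One way to cache planning state across repeated `toBatch()` calls is a lazily initialized field. The sketch below is illustrative only: `PathPartition`, `CachedPlanScan` and the `listFiles` callback are hypothetical, the reader factory is omitted, and the scan does not itself implement SupportsRuntimeFiltering.

```scala
import org.apache.spark.sql.connector.read.{Batch, InputPartition, PartitionReaderFactory, Scan}
import org.apache.spark.sql.types.StructType

// Hypothetical partition describing one file to read.
case class PathPartition(path: String) extends InputPartition

// Hypothetical scan; listFiles stands in for an expensive planning step.
class CachedPlanScan(schema: StructType, listFiles: () => Seq[String]) extends Scan {

  // Planned on first access, then reused by every later toBatch() call.
  private lazy val cachedBatch: Batch = {
    val partitions = listFiles().map(PathPartition(_)).toArray[InputPartition]
    new Batch {
      override def planInputPartitions(): Array[InputPartition] = partitions
      override def createReaderFactory(): PartitionReaderFactory =
        throw new UnsupportedOperationException("reader omitted in this sketch")
    }
  }

  override def readSchema(): StructType = schema

  override def toBatch(): Batch = cachedBatch
}
```

With this shape, calling `toBatch()` twice returns the same `Batch` and invokes `listFiles` only once.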
- def toContinuousStream(checkpointLocation: String): ContinuousStream
Returns the physical representation of this scan for streaming query with continuous mode. By default this method throws an exception; data sources must override this method to provide an implementation if the Table that creates this scan returns TableCapability#CONTINUOUS_READ support in its Table#capabilities().
- checkpointLocation
a path to Hadoop FS scratch space that can be used for failure recovery. Data streams for the same logical source in the same query will be given the same checkpointLocation.
- def toMicroBatchStream(checkpointLocation: String): MicroBatchStream
Returns the physical representation of this scan for streaming query with micro-batch mode. By default this method throws an exception; data sources must override this method to provide an implementation if the Table that creates this scan returns TableCapability#MICRO_BATCH_READ support in its Table#capabilities().
- checkpointLocation
a path to Hadoop FS scratch space that can be used for failure recovery. Data streams for the same logical source in the same query will be given the same checkpointLocation.
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)