@Evolving
public interface TableProvider

Note that TableProvider can only apply data operations to existing tables, such as read, append, delete, and overwrite. It does not support operations that require metadata changes, such as creating or dropping tables.

The major responsibility of this interface is to return a Table for read/write.
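For orientation, here is a minimal sketch of a provider for a hypothetical source. The class name and the MyTable helper are illustrative, not part of the API; inferPartitioning and supportsExternalMetadata are default methods, so only inferSchema and getTable must be implemented.

```java
import java.util.Map;

import org.apache.spark.sql.connector.catalog.Table;
import org.apache.spark.sql.connector.catalog.TableProvider;
import org.apache.spark.sql.connector.expressions.Transform;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.util.CaseInsensitiveStringMap;

// A minimal sketch: MyTable is a hypothetical Table implementation that
// carries out the actual scans and writes. Spark instantiates providers
// reflectively, so a public 0-arg constructor is required.
public class SimpleTableProvider implements TableProvider {

  @Override
  public StructType inferSchema(CaseInsensitiveStringMap options) {
    // A real source would inspect the data the options identify
    // (e.g. a file path); a fixed schema keeps the sketch short.
    return new StructType()
        .add("id", DataTypes.LongType)
        .add("value", DataTypes.StringType);
  }

  @Override
  public Table getTable(StructType schema, Transform[] partitioning,
                        Map<String, String> properties) {
    // Hand the (possibly externally provided) metadata to the table,
    // which must report it back unchanged.
    return new MyTable(schema, partitioning, properties);
  }
}
```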
Modifier and Type | Method and Description |
---|---|
Table | getTable(StructType schema, Transform[] partitioning, java.util.Map<String,String> properties): Return a Table instance with the specified table schema, partitioning and properties to do read/write. |
default Transform[] | inferPartitioning(CaseInsensitiveStringMap options): Infer the partitioning of the table identified by the given options. |
StructType | inferSchema(CaseInsensitiveStringMap options): Infer the schema of the table identified by the given options. |
default boolean | supportsExternalMetadata(): Returns true if the source can accept external table metadata when getting tables. |
StructType inferSchema(CaseInsensitiveStringMap options)

Infer the schema of the table identified by the given options.

Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
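As a variant of the sketch above (hedged: the "columns" option key is hypothetical, not a Spark convention), a source might derive the schema from the options alone instead of scanning data:

```java
@Override
public StructType inferSchema(CaseInsensitiveStringMap options) {
  // The options map is the only table identity available here. A file
  // source would typically open options.get("path") and sample the data;
  // this sketch instead parses a hypothetical "columns" option such as
  // "id,value", typing every column as string.
  StructType schema = new StructType();
  for (String name : options.get("columns").split(",")) {
    schema = schema.add(name.trim(), DataTypes.StringType);
  }
  return schema;
}
```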
default Transform[] inferPartitioning(CaseInsensitiveStringMap options)

Infer the partitioning of the table identified by the given options. By default this method returns empty partitioning; please override it if this source supports partitioning.

Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
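A sketch of an override, assuming a hypothetical source whose storage layout is partitioned by a date column; Expressions.identity (from org.apache.spark.sql.connector.expressions.Expressions) builds the identity transform:

```java
@Override
public Transform[] inferPartitioning(CaseInsensitiveStringMap options) {
  // Hypothetical: the data is laid out by a "date" column, so report it
  // as an identity transform instead of the empty default.
  return new Transform[] { Expressions.identity("date") };
}
```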
Table getTable(StructType schema, Transform[] partitioning, java.util.Map<String,String> properties)

Return a Table instance with the specified table schema, partitioning and properties to do read/write. The returned table should report the same schema and partitioning as the specified ones, or Spark may fail the operation.

Parameters:
schema - The specified table schema.
partitioning - The specified table partitioning.
properties - The specified table properties. It is case preserving (contains exactly what users specified) and implementations are free to use it case-sensitively or case-insensitively. It should be able to identify a table, e.g. file path, Kafka topic name, etc.
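A sketch restating that contract in code; MyTable is hypothetical:

```java
@Override
public Table getTable(StructType schema, Transform[] partitioning,
                      Map<String, String> properties) {
  // Contract: the returned Table must report this exact schema and
  // partitioning via Table.schema() and Table.partitioning(), or Spark
  // may fail the operation. Depending on supportsExternalMetadata(),
  // these arguments come from the infer methods or from external
  // metadata (user-specified or catalog-stored).
  return new MyTable(schema, partitioning, properties);
}
```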
default boolean supportsExternalMetadata()

Returns true if the source can accept external table metadata when getting tables. The external table metadata includes:
1. For the table reader: the user-specified schema from DataFrameReader/DataStreamReader, and the schema/partitioning stored in the Spark catalog.
2. For the table writer: the schema of the input DataFrame of DataFrameWriter/DataStreamWriter.

By default this method returns false, which means the schema and partitioning passed to getTable(StructType, Transform[], Map) are from the infer methods. Please override it if this source has expensive schema/partitioning inference and wants external table metadata to avoid inference.
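A sketch of opting in, assuming inference is expensive for this hypothetical source:

```java
@Override
public boolean supportsExternalMetadata() {
  // Inference for this hypothetical source would require a full data
  // scan, so accept schema/partitioning supplied by the user or stored
  // in the Spark catalog instead of inferring it every time.
  return true;
}
```

With this returning true, a user-specified schema is handed straight to getTable rather than inferred, e.g. (the provider class name is hypothetical):

```java
Dataset<Row> df = spark.read()
    .format("com.example.SimpleTableProvider")  // hypothetical provider
    .schema("id LONG, value STRING")            // passed to getTable, not inferred
    .load();
```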