@Evolving
public interface TableProvider

Note that TableProvider can only apply data operations to existing tables, such as read, append, delete, and overwrite. It does not support operations that require metadata changes, such as creating or dropping tables.

The major responsibility of this interface is to return a Table for read/write.
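For orientation, here is a minimal sketch of a provider for a hypothetical source. The class name and the MyTable helper are illustrative, not part of the API; inferPartitioning and supportsExternalMetadata are default methods, so only inferSchema and getTable must be implemented.

```java
import java.util.Map;

import org.apache.spark.sql.connector.catalog.Table;
import org.apache.spark.sql.connector.catalog.TableProvider;
import org.apache.spark.sql.connector.expressions.Transform;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;
import org.apache.spark.sql.util.CaseInsensitiveStringMap;

// A minimal sketch: MyTable is a hypothetical Table implementation that
// carries out the actual scans and writes. Spark instantiates providers
// reflectively, so a public 0-arg constructor is required.
public class SimpleTableProvider implements TableProvider {

  @Override
  public StructType inferSchema(CaseInsensitiveStringMap options) {
    // A real source would inspect the data the options identify
    // (e.g. a file path); a fixed schema keeps the sketch short.
    return new StructType()
        .add("id", DataTypes.LongType)
        .add("value", DataTypes.StringType);
  }

  @Override
  public Table getTable(StructType schema, Transform[] partitioning,
                        Map<String, String> properties) {
    // Hand the (possibly externally provided) metadata to the table,
    // which must report it back unchanged.
    return new MyTable(schema, partitioning, properties);
  }
}
```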
Modifier and Type | Method and Description |
---|---|
Table | getTable(StructType schema, Transform[] partitioning, java.util.Map<String,String> properties): Return a Table instance with the specified table schema, partitioning and properties to do read/write. |
default Transform[] | inferPartitioning(CaseInsensitiveStringMap options): Infer the partitioning of the table identified by the given options. |
StructType | inferSchema(CaseInsensitiveStringMap options): Infer the schema of the table identified by the given options. |
default boolean | supportsExternalMetadata(): Returns true if the source can accept external table metadata when getting tables. |
StructType inferSchema(CaseInsensitiveStringMap options)

Infer the schema of the table identified by the given options.

Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
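As a variant of the sketch above (hedged: the "columns" option key is hypothetical, not a Spark convention), a source might derive the schema from the options alone instead of scanning data:

```java
@Override
public StructType inferSchema(CaseInsensitiveStringMap options) {
  // The options map is the only table identity available here. A file
  // source would typically open options.get("path") and sample the data;
  // this sketch instead parses a hypothetical "columns" option such as
  // "id,value", typing every column as string.
  StructType schema = new StructType();
  for (String name : options.get("columns").split(",")) {
    schema = schema.add(name.trim(), DataTypes.StringType);
  }
  return schema;
}
```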
default Transform[] inferPartitioning(CaseInsensitiveStringMap options)

Infer the partitioning of the table identified by the given options. By default this method returns empty partitioning; please override it if this source supports partitioning.

Parameters:
options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
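A sketch of an override, assuming a hypothetical source whose storage layout is partitioned by a date column; Expressions.identity (from org.apache.spark.sql.connector.expressions.Expressions) builds the identity transform:

```java
@Override
public Transform[] inferPartitioning(CaseInsensitiveStringMap options) {
  // Hypothetical: the data is laid out by a "date" column, so report it
  // as an identity transform instead of the empty default.
  return new Transform[] { Expressions.identity("date") };
}
```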
Table getTable(StructType schema, Transform[] partitioning, java.util.Map<String,String> properties)

Return a Table instance with the specified table schema, partitioning and properties to do read/write. The returned table should report the same schema and partitioning as the specified ones, or Spark may fail the operation.

Parameters:
schema - The specified table schema.
partitioning - The specified table partitioning.
properties - The specified table properties. It is case preserving (contains exactly what users specified) and implementations are free to use it case-sensitively or case-insensitively. It should be able to identify a table, e.g. file path, Kafka topic name, etc.
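A sketch restating that contract in code; MyTable is hypothetical:

```java
@Override
public Table getTable(StructType schema, Transform[] partitioning,
                      Map<String, String> properties) {
  // Contract: the returned Table must report this exact schema and
  // partitioning via Table.schema() and Table.partitioning(), or Spark
  // may fail the operation. Depending on supportsExternalMetadata(),
  // these arguments come from the infer methods or from external
  // metadata (user-specified or catalog-stored).
  return new MyTable(schema, partitioning, properties);
}
```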
default boolean supportsExternalMetadata()

Returns true if the source can accept external table metadata when getting tables. The external table metadata includes:
1. For the table reader: the user-specified schema from DataFrameReader/DataStreamReader, and the schema/partitioning stored in the Spark catalog.
2. For the table writer: the schema of the input DataFrame of DataFrameWriter/DataStreamWriter.

By default this method returns false, which means the schema and partitioning passed to getTable(StructType, Transform[], Map) are from the infer methods. Please override it if this source has expensive schema/partitioning inference and wants external table metadata to avoid inference.
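A sketch of opting in, assuming inference is expensive for this hypothetical source:

```java
@Override
public boolean supportsExternalMetadata() {
  // Inference for this hypothetical source would require a full data
  // scan, so accept schema/partitioning supplied by the user or stored
  // in the Spark catalog instead of inferring it every time.
  return true;
}
```

With this returning true, a user-specified schema is handed straight to getTable rather than inferred, e.g. (the provider class name is hypothetical):

```java
Dataset<Row> df = spark.read()
    .format("com.example.SimpleTableProvider")  // hypothetical provider
    .schema("id LONG, value STRING")            // passed to getTable, not inferred
    .load();
```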