Interface TableCatalog

All Superinterfaces:
CatalogPlugin
All Known Subinterfaces:
CatalogExtension, StagingTableCatalog
All Known Implementing Classes:
DelegatingCatalogExtension

@Evolving public interface TableCatalog extends CatalogPlugin
Catalog methods for working with Tables.

TableCatalog implementations may be case sensitive or case insensitive. Spark passes table identifiers to the catalog without modification. When catalyst analysis is case insensitive, field names passed to alterTable(Identifier, TableChange...) are normalized to match the case used in the table schema for changes that update, rename, or drop existing columns.

Since:
3.0.0
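The field-name normalization described above can be illustrated with a minimal sketch. It assumes the schema is a plain list of column names and uses a hypothetical helper, FieldNameNormalizer, which is not part of Spark; it only shows the idea of resolving a requested name to the casing stored in the schema.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Spark: resolves a requested field name
// to the casing stored in the table schema.
class FieldNameNormalizer {
    static Optional<String> normalize(List<String> schemaFields,
                                      String requested,
                                      boolean caseSensitive) {
        for (String field : schemaFields) {
            boolean matches = caseSensitive
                    ? field.equals(requested)
                    : field.equalsIgnoreCase(requested);
            if (matches) {
                return Optional.of(field); // the schema's own casing wins
            }
        }
        return Optional.empty(); // no existing column matches
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "eventTime");
        System.out.println(normalize(schema, "EVENTTIME", false)); // Optional[eventTime]
        System.out.println(normalize(schema, "EVENTTIME", true));  // Optional.empty
    }
}
```

A case-sensitive catalog would receive the name exactly as written, so the second call finds no match.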
  • Field Details

    • PROP_LOCATION

      static final String PROP_LOCATION
      A reserved property to specify the location of the table. The files of the table should be under this location.
    • PROP_IS_MANAGED_LOCATION

      static final String PROP_IS_MANAGED_LOCATION
      A reserved property to indicate that the table location is managed, not user-specified. If this property is "true", SHOW CREATE TABLE will not generate the LOCATION clause.
    • PROP_EXTERNAL

      static final String PROP_EXTERNAL
      A reserved property to specify that a table was created with EXTERNAL.
    • PROP_COMMENT

      static final String PROP_COMMENT
      A reserved property to specify the description of the table.
    • PROP_PROVIDER

      static final String PROP_PROVIDER
      A reserved property to specify the provider of the table.
    • PROP_OWNER

      static final String PROP_OWNER
      A reserved property to specify the owner of the table.
    • OPTION_PREFIX

      static final String OPTION_PREFIX
      A prefix used to pass OPTIONS in table properties.
      See Also:
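As an illustration of how a prefix can carry OPTIONS inside table properties, here is a minimal sketch. The prefix value "option." is an assumption made for the example (this page does not show the constant's value), and OptionProperties is a hypothetical helper; real code should reference TableCatalog.OPTION_PREFIX directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark.
class OptionProperties {
    // Assumed value for illustration only; use TableCatalog.OPTION_PREFIX in real code.
    static final String ASSUMED_OPTION_PREFIX = "option.";

    // Returns the entries whose keys carry the prefix, with the prefix stripped.
    static Map<String, String> extractOptions(Map<String, String> properties) {
        Map<String, String> options = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(ASSUMED_OPTION_PREFIX)) {
                options.put(key.substring(ASSUMED_OPTION_PREFIX.length()), entry.getValue());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of("option.path", "/tmp/t", "comment", "demo");
        System.out.println(extractOptions(props)); // {path=/tmp/t}
    }
}
```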
  • Method Details

    • capabilities

      default Set<TableCatalogCapability> capabilities()
      Returns:
      the set of capabilities for this TableCatalog
    • listTables

      Identifier[] listTables(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      List the tables in a namespace from the catalog.

      If the catalog supports views, this must return identifiers for only tables and not views.

      Parameters:
      namespace - a multi-part namespace
      Returns:
      an array of Identifiers for tables
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
    • loadTable

      Table loadTable(Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Load table metadata by identifier from the catalog.

      If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

      Parameters:
      ident - a table identifier
      Returns:
      the table's metadata
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
    • loadTable

      default Table loadTable(Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Load table metadata of a specific version by identifier from the catalog.

      If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

      Parameters:
      ident - a table identifier
      version - version of the table
      Returns:
      the table's metadata
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
    • loadTable

      default Table loadTable(Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Load table metadata at a specific time by identifier from the catalog.

      If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

      Parameters:
      ident - a table identifier
      timestamp - timestamp of the table, which is microseconds since 1970-01-01 00:00:00 UTC
      Returns:
      the table's metadata
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
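Because the timestamp parameter is expressed in microseconds since the epoch, callers holding a java.time.Instant need a conversion. The sketch below shows one way to do it with the standard library; TimestampMicros is a hypothetical helper name, not something Spark provides.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: converts an Instant to microseconds since
// 1970-01-01 00:00:00 UTC, the unit expected by loadTable(ident, timestamp).
class TimestampMicros {
    static long toEpochMicros(Instant instant) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
    }

    public static void main(String[] args) {
        System.out.println(toEpochMicros(Instant.ofEpochSecond(1)));  // 1000000
        System.out.println(toEpochMicros(Instant.ofEpochMilli(1500))); // 1500000
    }
}
```

Passing milliseconds (for example, System.currentTimeMillis()) without conversion would address a point in time close to the epoch, so the unit is worth checking at call sites.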
    • invalidateTable

      default void invalidateTable(Identifier ident)
      Invalidate cached table metadata for an identifier.

      If the table is already loaded or cached, drop cached data. If the table does not exist or is not cached, do nothing. Calling this method should not query remote services.

      Parameters:
      ident - a table identifier
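The invalidation contract above (drop cached data, do nothing when nothing is cached, never call a remote service) can be sketched with a simple in-memory cache. Everything here is hypothetical: CachingCatalogSketch, its string identifiers, and the fetchFromRemote stand-in are illustrations, not Spark classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical catalog sketch with a metadata cache in front of a remote service.
class CachingCatalogSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger remoteCalls = new AtomicInteger(); // counts simulated remote lookups

    String loadTable(String ident) {
        // Only a cache miss reaches the (simulated) remote service.
        return cache.computeIfAbsent(ident, this::fetchFromRemote);
    }

    void invalidateTable(String ident) {
        // Removing an absent key is a no-op, and no remote call is made either way.
        cache.remove(ident);
    }

    private String fetchFromRemote(String ident) {
        remoteCalls.incrementAndGet();
        return "metadata-for-" + ident;
    }

    public static void main(String[] args) {
        CachingCatalogSketch catalog = new CachingCatalogSketch();
        catalog.loadTable("db.t");
        catalog.invalidateTable("db.t");
        catalog.invalidateTable("db.unknown");
        System.out.println(catalog.remoteCalls.get()); // 1
    }
}
```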
    • tableExists

      default boolean tableExists(Identifier ident)
      Test whether a table exists using an identifier from the catalog.

      If the catalog supports views and contains a view for the identifier and not a table, this must return false.

      Parameters:
      ident - a table identifier
      Returns:
      true if the table exists, false otherwise
    • createTable

      @Deprecated Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Deprecated.
      Create a table in the catalog.

      This is deprecated. Please override createTable(Identifier, Column[], Transform[], Map) instead.

      Throws:
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
    • createTable

      default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String,String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
      Create a table in the catalog.
      Parameters:
      ident - a table identifier
      columns - the columns of the new table.
      partitions - transforms to use for partitioning data in the table
      properties - a string map of table properties
      Returns:
      metadata for the new table
      Throws:
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If a table or view already exists for the identifier
      UnsupportedOperationException - If a requested partition transform is not supported
      org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
    • useNullableQuerySchema

      default boolean useNullableQuerySchema()
      If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.
    • alterTable

      Table alterTable(Identifier ident, TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
      Apply a set of changes to a table in the catalog.

      Implementations may reject the requested changes. If any change is rejected, none of the changes should be applied to the table.

      The requested changes must be applied in the order given.

      If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.

      Parameters:
      ident - a table identifier
      changes - changes to apply to the table
      Returns:
      updated metadata for the table
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
      IllegalArgumentException - If any change is rejected by the implementation.
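The two requirements above (apply changes in order, and apply none of them if any is rejected) can be met by working on a copy and committing at the end. The following is a minimal sketch under stated assumptions: the table is modeled as a list of column names, and changes are modeled as hypothetical functions rather than Spark's TableChange objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical in-memory table: illustrates ordered, all-or-nothing changes.
class AtomicAlterSketch {
    private List<String> columns = new ArrayList<>(List.of("id", "name"));

    List<String> columns() {
        return List.copyOf(columns);
    }

    // Each change returns a new column list or throws IllegalArgumentException to reject.
    List<String> alter(List<UnaryOperator<List<String>>> changes) {
        List<String> working = new ArrayList<>(columns); // never mutate live state directly
        for (UnaryOperator<List<String>> change : changes) {
            working = change.apply(working);             // applied in the order given
        }
        columns = working;                               // commit only after every change passed
        return columns();
    }

    static UnaryOperator<List<String>> addColumn(String name) {
        return cols -> {
            if (cols.contains(name)) throw new IllegalArgumentException("Column exists: " + name);
            List<String> next = new ArrayList<>(cols);
            next.add(name);
            return next;
        };
    }

    static UnaryOperator<List<String>> dropColumn(String name) {
        return cols -> {
            if (!cols.contains(name)) throw new IllegalArgumentException("No such column: " + name);
            List<String> next = new ArrayList<>(cols);
            next.remove(name);
            return next;
        };
    }

    public static void main(String[] args) {
        AtomicAlterSketch table = new AtomicAlterSketch();
        try {
            table.alter(List.of(addColumn("age"), dropColumn("missing")));
        } catch (IllegalArgumentException rejected) {
            System.out.println(table.columns()); // [id, name]
        }
    }
}
```

In the usage above the first change would succeed on its own, but the rejected second change leaves the table exactly as it was.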
    • dropTable

      boolean dropTable(Identifier ident)
      Drop a table in the catalog.

      If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.

      Parameters:
      ident - a table identifier
      Returns:
      true if a table was deleted, false if no table exists for the identifier
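A minimal sketch of the drop semantics above, for a catalog that stores both tables and views. DropTableSketch and its string identifiers are hypothetical; the point is only that a view under the same identifier is left untouched and the call returns false.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalog holding tables and views under string identifiers.
class DropTableSketch {
    enum Kind { TABLE, VIEW }

    private final Map<String, Kind> objects = new HashMap<>();

    void register(String ident, Kind kind) {
        objects.put(ident, kind);
    }

    boolean contains(String ident) {
        return objects.containsKey(ident);
    }

    boolean dropTable(String ident) {
        // Map.remove(key, value) removes the entry only if it is mapped to TABLE,
        // so views and missing identifiers both yield false.
        return objects.remove(ident, Kind.TABLE);
    }

    public static void main(String[] args) {
        DropTableSketch catalog = new DropTableSketch();
        catalog.register("db.t", Kind.TABLE);
        catalog.register("db.v", Kind.VIEW);
        System.out.println(catalog.dropTable("db.t")); // true
        System.out.println(catalog.dropTable("db.v")); // false
        System.out.println(catalog.contains("db.v"));  // true
    }
}
```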
    • purgeTable

      default boolean purgeTable(Identifier ident) throws UnsupportedOperationException
      Drop a table in the catalog and completely remove its data, skipping the trash even if the catalog supports one.

      If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.

      Catalogs that support purging tables should override this method. The default implementation throws UnsupportedOperationException.

      Parameters:
      ident - a table identifier
      Returns:
      true if a table was deleted, false if no table exists for the identifier
      Throws:
      UnsupportedOperationException - If table purging is not supported
      Since:
      3.1.0
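The relationship between dropTable, purgeTable, and the throwing default can be sketched as follows. PurgeCapable and TrashCatalogSketch are hypothetical stand-ins with string identifiers, and the trash is modeled as a simple set; this is an illustration of the contract, not Spark's code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the relevant slice of a catalog interface.
interface PurgeCapable {
    boolean dropTable(String ident);

    // Mirrors the documented default: purging is unsupported unless overridden.
    default boolean purgeTable(String ident) {
        throw new UnsupportedOperationException("Purge table is not supported.");
    }
}

// Hypothetical catalog with a trash; it overrides purgeTable to bypass the trash.
class TrashCatalogSketch implements PurgeCapable {
    final Set<String> tables = new HashSet<>();
    final Set<String> trash = new HashSet<>();

    @Override
    public boolean dropTable(String ident) {
        boolean existed = tables.remove(ident);
        if (existed) {
            trash.add(ident); // a regular drop keeps the data recoverable
        }
        return existed;
    }

    @Override
    public boolean purgeTable(String ident) {
        return tables.remove(ident); // removed outright, nothing is moved to the trash
    }

    public static void main(String[] args) {
        TrashCatalogSketch catalog = new TrashCatalogSketch();
        catalog.tables.add("db.a");
        catalog.tables.add("db.b");
        catalog.dropTable("db.a");
        catalog.purgeTable("db.b");
        System.out.println(catalog.trash); // [db.a]
    }
}
```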
    • renameTable

      void renameTable(Identifier oldIdent, Identifier newIdent) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
      Renames a table in the catalog.

      If the catalog supports views and contains a view for the old identifier and not a table, this throws NoSuchTableException. Additionally, if the new identifier is a table or a view, this throws TableAlreadyExistsException.

      If the catalog does not support table renames between namespaces, it throws UnsupportedOperationException.

      Parameters:
      oldIdent - the table identifier of the existing table to rename
      newIdent - the new table identifier of the table
      Throws:
      org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table to rename doesn't exist or is a view
      org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If the new table name already exists or is a view
      UnsupportedOperationException - If the namespaces of old and new identifiers do not match (optional)
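The rename checks listed above can be sketched for a catalog that does not support renames across namespaces. All names here are hypothetical: the two exception classes are stand-ins for NoSuchTableException and TableAlreadyExistsException, and identifiers are modeled as dotted strings whose namespace is everything before the last dot.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalog illustrating the rename checks.
class RenameTableSketch {
    // Stand-ins for Spark's checked exceptions.
    static class NoSuchTableStandIn extends Exception {
        NoSuchTableStandIn(String ident) { super("No such table: " + ident); }
    }

    static class TableAlreadyExistsStandIn extends Exception {
        TableAlreadyExistsStandIn(String ident) { super("Already exists: " + ident); }
    }

    enum Kind { TABLE, VIEW }

    final Map<String, Kind> objects = new HashMap<>();

    void renameTable(String oldIdent, String newIdent)
            throws NoSuchTableStandIn, TableAlreadyExistsStandIn {
        if (objects.get(oldIdent) != Kind.TABLE) {
            throw new NoSuchTableStandIn(oldIdent);        // missing, or a view
        }
        if (objects.containsKey(newIdent)) {
            throw new TableAlreadyExistsStandIn(newIdent); // a table or a view is in the way
        }
        if (!namespaceOf(oldIdent).equals(namespaceOf(newIdent))) {
            throw new UnsupportedOperationException("Cannot rename across namespaces");
        }
        objects.remove(oldIdent);
        objects.put(newIdent, Kind.TABLE);
    }

    static String namespaceOf(String ident) {
        int dot = ident.lastIndexOf('.');
        return dot < 0 ? "" : ident.substring(0, dot);
    }

    public static void main(String[] args) throws Exception {
        RenameTableSketch catalog = new RenameTableSketch();
        catalog.objects.put("db.t", Kind.TABLE);
        catalog.renameTable("db.t", "db.t2");
        System.out.println(catalog.objects.keySet()); // [db.t2]
    }
}
```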