Interface SupportsMetadataColumns

All Superinterfaces:
Table

@Evolving public interface SupportsMetadataColumns extends Table
An interface for tables that expose columns that are not part of the table schema. For example, a file source could expose a "file" column that contains the path of the file each row was read from.

The columns returned by metadataColumns() may be passed as StructField in requested projections. Sources that implement both this interface and column projection through SupportsPushDownRequiredColumns must accept metadata fields passed to SupportsPushDownRequiredColumns.pruneColumns(StructType).
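
The projection requirement above can be illustrated with a minimal sketch. It uses plain Java stand-ins rather than Spark classes: FileSourceSketch, its column lists, and the readRow helper are all hypothetical names introduced for illustration, and column names stand in for StructField entries. The point is only that a projection naming the "file" metadata column is accepted alongside ordinary data columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a file-based source; none of these types are Spark classes.
final class FileSourceSketch {
    // Columns in the table schema (hypothetical).
    static final List<String> DATA_COLUMNS = Arrays.asList("id", "name");
    // Columns exposed outside the schema, analogous to metadataColumns().
    static final List<String> METADATA_COLUMNS = Arrays.asList("file");

    // Analogous to pruneColumns(StructType): the requested projection may name
    // metadata columns as well as data columns, and both must be accepted.
    static List<String> pruneColumns(List<String> requested) {
        for (String name : requested) {
            if (!DATA_COLUMNS.contains(name) && !METADATA_COLUMNS.contains(name)) {
                throw new IllegalArgumentException("Unknown column: " + name);
            }
        }
        return new ArrayList<>(requested);
    }

    // Builds one output row: data values come from the record, while the
    // "file" value is filled in from the path of the file being read.
    static Map<String, Object> readRow(String path,
                                       Map<String, Object> record,
                                       List<String> projection) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String name : projection) {
            row.put(name, "file".equals(name) ? path : record.get(name));
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 1);
        record.put("name", "a");
        List<String> projection = pruneColumns(Arrays.asList("id", "file"));
        // prints {id=1, file=/data/part-0.csv}
        System.out.println(readRow("/data/part-0.csv", record, projection));
    }
}
```

A real implementation would work with StructType and MetadataColumn instead of name lists; the stand-ins only keep the sketch runnable without Spark on the classpath.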

If a table column and a metadata column have the same name, the conflict is resolved by either renaming or suppressing the metadata column. See canRenameConflictingMetadataColumns().

Since:
3.1.0
  • Method Details

    • metadataColumns

      MetadataColumn[] metadataColumns()
      Metadata columns that are supported by this Table.

      The columns returned by this method may be passed as StructField in requested projections using SupportsPushDownRequiredColumns.pruneColumns(StructType).

      If a table column and a metadata column have the same name, the conflict is resolved by either renaming or suppressing the metadata column. See canRenameConflictingMetadataColumns().

      Returns:
      an array of MetadataColumn
    • canRenameConflictingMetadataColumns

      default boolean canRenameConflictingMetadataColumns()
      Determines how this data source handles name conflicts between metadata and data columns.

      If true, Spark will automatically rename the metadata column to resolve the conflict. End users can reliably select metadata columns (renamed or not) with Dataset.metadataColumn, and internal code can use MetadataAttributeWithLogicalName to extract the logical name from a metadata attribute.

      If false, the data column will hide the metadata column. Table implementations that do not support renaming are encouraged to reject data column names that conflict with metadata column names.
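
The two outcomes, plus the recommended rejection, can be modelled with a small sketch. This is not Spark's implementation: MetadataConflictSketch and its methods are hypothetical, names are compared as exact strings, and the renaming scheme (prepending "_" until the name is free) is an assumption chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of the two conflict policies; not Spark's implementation.
final class MetadataConflictSketch {
    // Maps each metadata column's logical name to the name it is exposed under.
    // Hidden metadata columns are omitted from the result.
    static Map<String, String> resolve(List<String> dataColumns,
                                       List<String> metadataColumns,
                                       boolean canRename) {
        Set<String> taken = new HashSet<>(dataColumns);
        Map<String, String> exposed = new LinkedHashMap<>();
        for (String logical : metadataColumns) {
            String name = logical;
            if (taken.contains(name)) {
                if (!canRename) {
                    continue; // the data column hides the metadata column
                }
                // Assumed renaming scheme: prepend "_" until the name is free.
                while (taken.contains(name)) {
                    name = "_" + name;
                }
            }
            taken.add(name);
            exposed.put(logical, name);
        }
        return exposed;
    }

    // For sources that cannot rename: refuse schemas whose data column names
    // collide with metadata column names.
    static void rejectConflicts(List<String> dataColumns, List<String> metadataColumns) {
        for (String name : dataColumns) {
            if (metadataColumns.contains(name)) {
                throw new IllegalArgumentException(
                    "Data column name conflicts with metadata column: " + name);
            }
        }
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("id", "file");
        List<String> meta = Arrays.asList("file", "row_index");
        System.out.println(resolve(data, meta, true));  // prints {file=_file, row_index=row_index}
        System.out.println(resolve(data, meta, false)); // prints {row_index=row_index}
    }
}
```

With renaming enabled, the conflicting metadata column stays reachable under a new name and its logical name is preserved as the map key. Without renaming it disappears from the result, which is why rejecting the conflicting schema up front can be the safer choice.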