Package org.apache.spark.sql.sources
Interface SupportsStreamSourceMetadataColumns
- All Superinterfaces:
StreamSourceProvider
Implemented by StreamSourceProvider objects that can generate file metadata columns.
This trait extends the basic StreamSourceProvider by allowing the addition of metadata
columns to the schema of the Stream Data Source.
Method Summary
Modifier and Type: scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.expressions.AttributeReference>
Method: getMetadataOutput(SparkSession spark, scala.collection.immutable.Map<String, String> options, scala.Option<StructType> userSpecifiedSchema)
Description: Returns the metadata columns that should be added to the schema of the Stream Source.
Methods inherited from interface org.apache.spark.sql.sources.StreamSourceProvider
createSource, sourceSchema
Method Details
getMetadataOutput
scala.collection.immutable.Seq<org.apache.spark.sql.catalyst.expressions.AttributeReference> getMetadataOutput(SparkSession spark, scala.collection.immutable.Map<String, String> options, scala.Option<StructType> userSpecifiedSchema)
Returns the metadata columns that should be added to the schema of the Stream Source. These metadata columns supplement the columns defined in the sourceSchema() of the StreamSourceProvider.
The final schema for the Stream Source therefore consists of the source schema as defined by StreamSourceProvider.sourceSchema(), with the metadata columns added at the end. The caller is responsible for resolving any naming conflicts with the source schema.
An example of using this streaming source metadata output interface is when a customized file-based streaming source needs to expose file metadata columns, leveraging the hidden file metadata columns from its underlying storage format.
- Parameters:
spark - The SparkSession used for the operation.
options - A map of options of the Stream Data Source.
userSpecifiedSchema - An optional user-provided schema of the Stream Data Source.
- Returns:
A Seq of AttributeReference representing the metadata output attributes.
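The schema composition rule described above (source columns first, metadata columns appended, conflicts left to the caller) can be illustrated with a minimal, Spark-free sketch. Everything in it is an assumption made for illustration: Column is a hypothetical stand-in for a schema field, finalSchema is not a Spark method, and rejecting a name clash (compared case-insensitively) is only one possible conflict policy. The documentation does not say how a caller should resolve conflicts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class MetadataSchemaSketch {

    // Hypothetical stand-in for one schema column; not a Spark class.
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Returns the source columns followed by the metadata columns.
    // Assumed policy: a metadata column whose name matches a source column
    // or an earlier metadata column (ignoring case) is rejected.
    static List<Column> finalSchema(List<Column> source, List<Column> metadata) {
        Set<String> seen = new HashSet<>();
        for (Column c : source) {
            seen.add(c.name.toLowerCase(Locale.ROOT));
        }
        List<Column> result = new ArrayList<>(source);
        for (Column m : metadata) {
            // Set.add returns false when the name is already present.
            if (!seen.add(m.name.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException(
                    "Metadata column conflicts with an existing column: " + m.name);
            }
            result.add(m);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative columns only; the names and types are made up.
        List<Column> source = Arrays.asList(
            new Column("id", "long"), new Column("value", "string"));
        List<Column> metadata = Arrays.asList(new Column("_metadata", "struct"));
        for (Column c : finalSchema(source, metadata)) {
            System.out.println(c.name + ": " + c.type);
        }
    }
}
```

A caller could equally rename or drop the clashing metadata column instead of failing; the sketch fails fast only because that keeps the conflict visible.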