Class SchemaInferenceUtils
Object
org.apache.spark.sql.pipelines.util.SchemaInferenceUtils
- 
Constructor SummaryConstructors
- 
Method SummaryModifier and TypeMethodDescriptionstatic scala.collection.immutable.Seq<TableChange>diffSchemas(StructType currentSchema, StructType targetSchema) Determines the column changes needed to transform the current schema into the target schema.static StructTypeinferSchemaFromFlows(scala.collection.immutable.Seq<ResolvedFlow> flows, scala.Option<StructType> userSpecifiedSchema) Given a set of flows that write to the same destination and possibly a user-specified schema, we infer the schema of the destination dataset.
- 
Constructor Details- 
SchemaInferenceUtilspublic SchemaInferenceUtils()
 
- 
- 
Method Details- 
inferSchemaFromFlowspublic static StructType inferSchemaFromFlows(scala.collection.immutable.Seq<ResolvedFlow> flows, scala.Option<StructType> userSpecifiedSchema) Given a set of flows that write to the same destination and possibly a user-specified schema, we infer the schema of the destination dataset. The logic is as follows: 1. If there are no incoming flows, return the user-specified schema (if provided) or an empty schema. 2. If there are incoming flows, we merge the schemas of all flows that write to the same destination. 3. If a user-specified schema is provided, we merge it with the inferred schema. The user-specified schema will take precedence over the inferred schema. Returns an error if encountered during schema inference or merging the inferred schema with the user-specified one.- Parameters:
- flows- (undocumented)
- userSpecifiedSchema- (undocumented)
- Returns:
- (undocumented)
 
- 
diffSchemaspublic static scala.collection.immutable.Seq<TableChange> diffSchemas(StructType currentSchema, StructType targetSchema) Determines the column changes needed to transform the current schema into the target schema.This function compares the current schema with the target schema and produces a sequence of TableChange objects representing: 1. New columns that need to be added 2. Existing columns that need type updates - Parameters:
- currentSchema- The current schema of the table
- targetSchema- The target schema that we want the table to have
- Returns:
- A sequence of TableChange objects representing the necessary changes
 
 
-