Package org.apache.spark.ml.util
Class SchemaUtils
Object
org.apache.spark.ml.util.SchemaUtils
Utils for handling schemas.

Constructor Summary
- SchemaUtils()

Method Summary
- static StructType appendColumn(StructType schema, String colName, DataType dataType, boolean nullable): Appends a new column to the input schema.
- static StructType appendColumn(StructType schema, StructField col): Appends a new column to the input schema.
- static void checkColumnType(StructType schema, String colName, DataType dataType, String msg): Check whether the given schema contains a column of the required data type.
- static void checkColumnTypes(StructType schema, String colName, scala.collection.immutable.Seq<DataType> dataTypes, String msg): Check whether the given schema contains a column of one of the required data types.
- static void checkNumericType(StructType schema, String colName, String msg): Check whether the given schema contains a column of a numeric data type.
- static boolean checkSchemaFieldExist(StructType schema, String colName): Check whether a certain column name exists in the schema.
- static StructField getSchemaField(StructType schema, String colName): Get schema field.
- static DataType getSchemaFieldType(StructType schema, String colName): Get schema field type.
- static String toSQLId
- static StructType updateAttributeGroupSize(StructType schema, String colName, int size): Update the size of an ML Vector column.
- static StructType updateField(StructType schema, StructField field, boolean overwriteMetadata): Update the metadata of an existing column.
- static StructType updateNumeric(StructType schema, String colName): Update the numeric meta of an existing column.
- static StructType updateNumValues(StructType schema, String colName, int numValues): Update the number of values of an existing column.
- static void validateVectorCompatibleColumn(StructType schema, String colName): Check whether the given column in the schema is one of the supported vector types: Vector, Array[Float], Array[Double].

Constructor Details
SchemaUtils
public SchemaUtils()

Method Details
checkColumnType
public static void checkColumnType(StructType schema, String colName, DataType dataType, String msg)
Check whether the given schema contains a column of the required data type.
- Parameters:
- colName - column name
- dataType - required column data type
- schema - (undocumented)
- msg - (undocumented)
 
checkColumnTypes
public static void checkColumnTypes(StructType schema, String colName, scala.collection.immutable.Seq<DataType> dataTypes, String msg)
Check whether the given schema contains a column of one of the required data types.
- Parameters:
- colName - column name
- dataTypes - required column data types
- schema - (undocumented)
- msg - (undocumented)
 
checkNumericType
public static void checkNumericType(StructType schema, String colName, String msg)
Check whether the given schema contains a column of a numeric data type.
- Parameters:
- colName - column name
- schema - (undocumented)
- msg - (undocumented)
 
appendColumn
public static StructType appendColumn(StructType schema, String colName, DataType dataType, boolean nullable)
Appends a new column to the input schema. This fails if the given output column already exists.
- Parameters:
- schema - input schema
- colName - new column name. If this column name is an empty string "", this method returns the input schema unchanged. This allows users to disable output columns.
- dataType - new column data type
- nullable - (undocumented)
- Returns:
- new schema with the input column appended
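
The contract described above (append at the end, fail on a duplicate name, return the input unchanged for an empty name) can be illustrated with a small stand-alone model. This is only a sketch and does not call Spark: the schema is modelled as an ordered map from column name to a type name, and `AppendColumnSketch`, its `appendColumn` signature, and the choice of `IllegalArgumentException` for the failure case are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model of the documented appendColumn contract.
// It does NOT use Spark; every name in this class is hypothetical.
final class AppendColumnSketch {

    // Returns a schema with (colName -> typeName) appended as the last column.
    static Map<String, String> appendColumn(
            Map<String, String> schema, String colName, String typeName) {
        // An empty column name means the output column is disabled:
        // hand back the input schema unchanged.
        if (colName.isEmpty()) {
            return schema;
        }
        // Appending a column that already exists is an error.
        // ASSUMPTION: the exception type is chosen for this sketch only.
        if (schema.containsKey(colName)) {
            throw new IllegalArgumentException(
                    "Column " + colName + " already exists.");
        }
        // Copy first so the input schema is never mutated.
        Map<String, String> out = new LinkedHashMap<>(schema);
        out.put(colName, typeName);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("features", "vector");

        // Normal case: the new column is appended after the existing ones.
        System.out.println(appendColumn(schema, "prediction", "double").keySet());

        // Empty name: the very same schema object comes back.
        System.out.println(appendColumn(schema, "", "double") == schema);

        // Duplicate name: the call is rejected.
        try {
            appendColumn(schema, "features", "vector");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The empty-name branch is what lets a caller switch off an optional output column without special-casing the schema transformation.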
 
appendColumn
public static StructType appendColumn(StructType schema, StructField col)
Appends a new column to the input schema. This fails if the given output column already exists.
- Parameters:
- schema - input schema
- col - new column schema
- Returns:
- new schema with the input column appended
 
updateAttributeGroupSize
public static StructType updateAttributeGroupSize(StructType schema, String colName, int size)
Update the size of an ML Vector column. If this column does not exist, append it.
- Parameters:
- schema - input schema
- colName - column name
- size - number of features
- Returns:
- new schema
 
updateNumValues
public static StructType updateNumValues(StructType schema, String colName, int numValues)
Update the number of values of an existing column. If this column does not exist, append it.
- Parameters:
- schema - input schema
- colName - column name
- numValues - number of values
- Returns:
- new schema
 
updateNumeric
public static StructType updateNumeric(StructType schema, String colName)
Update the numeric meta of an existing column. If this column does not exist, append it.
- Parameters:
- schema - input schema
- colName - column name
- Returns:
- new schema
 
updateField
public static StructType updateField(StructType schema, StructField field, boolean overwriteMetadata)
Update the metadata of an existing column. If this column does not exist, append it.
- Parameters:
- schema - input schema
- field - struct field
- overwriteMetadata - whether to overwrite the metadata. If true, the metadata in the schema will be overwritten. If false, the metadata in field and schema will be merged to generate output metadata.
- Returns:
- new schema
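
The three outcomes described above (append when the column is missing, overwrite, merge) can be sketched with a stand-alone model. It does not call Spark and tracks metadata only: the schema is an ordered map from column name to a metadata map, and `UpdateFieldSketch` and its `updateField` signature are hypothetical. The text above does not say which side wins when both metadata maps define the same key, so this sketch arbitrarily keeps the value already in the schema; treat that as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model of the documented updateField contract (no Spark).
// Schema: ordered map of column name -> metadata. All names are hypothetical.
final class UpdateFieldSketch {

    static Map<String, Map<String, String>> updateField(
            Map<String, Map<String, String>> schema,
            String fieldName,
            Map<String, String> fieldMetadata,
            boolean overwriteMetadata) {
        // Copy so the input schema is never mutated.
        Map<String, Map<String, String>> out = new LinkedHashMap<>(schema);
        Map<String, String> existing = schema.get(fieldName);

        if (existing == null || overwriteMetadata) {
            // Missing column: it is appended at the end.
            // Existing column with overwrite: its metadata is replaced,
            // and the column keeps its position.
            out.put(fieldName, new LinkedHashMap<>(fieldMetadata));
        } else {
            // Merge the metadata of the field and of the schema.
            // ASSUMPTION: on a key conflict the schema's value is kept.
            Map<String, String> merged = new LinkedHashMap<>(fieldMetadata);
            merged.putAll(existing);
            out.put(fieldName, merged);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> labelMeta = new LinkedHashMap<>();
        labelMeta.put("numValues", "3");
        Map<String, Map<String, String>> schema = new LinkedHashMap<>();
        schema.put("label", labelMeta);

        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("name", "label");

        // Merge: both keys survive.
        System.out.println(updateField(schema, "label", incoming, false));
        // Overwrite: only the incoming metadata survives.
        System.out.println(updateField(schema, "label", incoming, true));
        // Missing column: appended after the existing columns.
        System.out.println(updateField(schema, "weight", incoming, false).keySet());
    }
}
```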
 
validateVectorCompatibleColumn
public static void validateVectorCompatibleColumn(StructType schema, String colName)
Check whether the given column in the schema is one of the supported vector types: Vector, Array[Float], Array[Double].
- Parameters:
- schema - input schema
- colName - column name
 
toSQLId

getSchemaField
public static StructField getSchemaField(StructType schema, String colName)
Get schema field.
- Parameters:
- schema - input schema
- colName - column name; nested column names are supported.
- Returns:
- (undocumented)
 
getSchemaFieldType
public static DataType getSchemaFieldType(StructType schema, String colName)
Get schema field type.
- Parameters:
- schema - input schema
- colName - column name; nested column names are supported.
- Returns:
- (undocumented)
 
checkSchemaFieldExist
public static boolean checkSchemaFieldExist(StructType schema, String colName)
Check whether a certain column name exists in the schema.
- Parameters:
- schema - input schema
- colName - column name; nested column names are supported.
- Returns:
- (undocumented)