Package org.apache.spark.ml.util
Class MetadataUtils
Object
org.apache.spark.ml.util.MetadataUtils
Helper utilities for algorithms using ML metadata
Constructor Summary
Constructors
MetadataUtils()
Method Summary
Modifier and Type | Method | Description
static scala.collection.immutable.Map<Object,Object> | getCategoricalFeatures(StructField featuresSchema) | Examine a schema to identify categorical (Binary and Nominal) features.
static int[] | getFeatureIndicesFromNames(StructField col, String[] names) | Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
static scala.Option<Object> | getNumClasses(StructField labelSchema) | Examine a schema to identify the number of classes in a label column.
static scala.Option<Object> | getNumFeatures(StructField vectorSchema) | Examine a schema to identify the number of features in a vector column.
Constructor Details
MetadataUtils
public MetadataUtils()
Method Details
getNumClasses
public static scala.Option<Object> getNumClasses(StructField labelSchema)
Examine a schema to identify the number of classes in a label column. Returns None if the number of labels is not specified, or if the label column is continuous.
Parameters:
labelSchema - (undocumented)
Returns:
(undocumented)
getNumFeatures
public static scala.Option<Object> getNumFeatures(StructField vectorSchema)
Examine a schema to identify the number of features in a vector column. Returns None if the number of features is not specified.
Parameters:
vectorSchema - (undocumented)
Returns:
(undocumented)
getCategoricalFeatures
public static scala.collection.immutable.Map<Object,Object> getCategoricalFeatures(StructField featuresSchema)
Examine a schema to identify categorical (Binary and Nominal) features.
Parameters:
featuresSchema - Schema of the features column. If a feature does not have metadata, it is assumed to be continuous. If a feature is Nominal, then it must have the number of values specified.
Returns:
Map from feature index to number of categories. The map's set of keys will be the set of categorical feature indices.
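The shape of the map returned by getCategoricalFeatures can be illustrated with a plain-Java sketch that models the documented contract without calling Spark. The class CategoricalFeaturesSketch, the Kind enum, the Feature holder, and the categoricalFeatures method are hypothetical names invented for this illustration. Counting a Binary feature as two categories and throwing IllegalArgumentException for a Nominal feature with no value count are assumptions about how the stated requirements might be enforced, not a description of the library's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone model of the documented contract; none of these types are Spark classes.
class CategoricalFeaturesSketch {

    // Hypothetical stand-in for the feature kinds named in the description.
    enum Kind { CONTINUOUS, BINARY, NOMINAL }

    // Hypothetical per-feature metadata: a kind plus an optional category count.
    static final class Feature {
        final Kind kind;
        final Integer numValues; // null when the count is not specified

        Feature(Kind kind, Integer numValues) {
            this.kind = kind;
            this.numValues = numValues;
        }
    }

    // Builds the map "feature index -> number of categories" for categorical features only.
    static Map<Integer, Integer> categoricalFeatures(Feature[] features) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < features.length; i++) {
            Feature f = features[i];
            // A feature without metadata (null here) is treated as continuous and skipped.
            if (f == null || f.kind == Kind.CONTINUOUS) {
                continue;
            }
            if (f.kind == Kind.BINARY) {
                result.put(i, 2); // assumption: a binary feature counts as two categories
            } else if (f.numValues == null) {
                // Assumption: a Nominal feature without a value count is rejected.
                throw new IllegalArgumentException(
                    "Nominal feature " + i + " has no number of values specified");
            } else {
                result.put(i, f.numValues);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Feature[] features = {
            new Feature(Kind.CONTINUOUS, null),
            new Feature(Kind.BINARY, null),
            null, // no metadata: assumed continuous
            new Feature(Kind.NOMINAL, 4)
        };
        System.out.println(categoricalFeatures(features));
    }
}
```

In this model only indices 1 and 3 appear as keys, matching the statement that the key set is exactly the set of categorical feature indices.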
getFeatureIndicesFromNames
public static int[] getFeatureIndicesFromNames(StructField col, String[] names)
Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
Parameters:
col - Vector column which must have feature names specified via attributes
names - List of feature names
Returns:
(undocumented)
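The name-to-index lookup described for getFeatureIndicesFromNames can be modeled in plain Java, without Spark on the classpath. This is only a sketch: FeatureIndexSketch and indicesFromNames are hypothetical names, the attributeNames array stands in for the feature names attached to the vector column, and the example assumes that "in order" means the order of the requested names. Treating an unknown name as an error is likewise an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Standalone model of the name-to-index lookup; it does not call Spark.
class FeatureIndexSketch {

    // attributeNames[i] is the (hypothetical) name attached to feature i of the vector column.
    static int[] indicesFromNames(String[] attributeNames, String[] names) {
        Map<String, Integer> indexByName = new HashMap<>();
        for (int i = 0; i < attributeNames.length; i++) {
            indexByName.put(attributeNames[i], i);
        }
        int[] indices = new int[names.length];
        for (int j = 0; j < names.length; j++) {
            Integer index = indexByName.get(names[j]);
            if (index == null) {
                // Assumption: an unknown name is an error rather than being skipped.
                throw new IllegalArgumentException("Unknown feature name: " + names[j]);
            }
            indices[j] = index; // output order follows the requested names
        }
        return indices;
    }

    public static void main(String[] args) {
        String[] attributeNames = {"age", "height", "weight"};
        int[] indices = indicesFromNames(attributeNames, new String[] {"weight", "age"});
        System.out.println(Arrays.toString(indices)); // prints [2, 0]
    }
}
```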