Package org.apache.spark.ml.feature
Interface StringIndexerBase
All Superinterfaces:
    HasHandleInvalid, HasInputCol, HasInputCols, HasOutputCol, HasOutputCols, Identifiable, Params, Serializable
All Known Implementing Classes:
    StringIndexer, StringIndexerModel
public interface StringIndexerBase
extends Params, HasHandleInvalid, HasInputCol, HasOutputCol, HasInputCols, HasOutputCols
Base trait for StringIndexer and StringIndexerModel.

Method Summary

getInOutCols
    Returns the input and output column names as corresponding pairs.
String getStringOrderType()
handleInvalid
    Param for how to handle invalid data (unseen labels or NULL values).
stringOrderType
    Param for how to order labels of string column.
StructField validateAndTransformField(StructType schema, String inputColName, DataType inputDataType, String outputColName)
validateAndTransformSchema(StructType schema, boolean skipNonExistsCol)
    Validates and transforms the input schema.

Methods inherited from interface org.apache.spark.ml.param.shared.HasHandleInvalid
    getHandleInvalid
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCol
    getInputCol, inputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasInputCols
    getInputCols, inputCols
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
    getOutputCol, outputCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCols
    getOutputCols, outputCols
Methods inherited from interface org.apache.spark.ml.util.Identifiable
    toString, uid
Methods inherited from interface org.apache.spark.ml.param.Params
    clear, copy, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Method Details

getInOutCols
Returns the input and output column names as corresponding pairs.

getStringOrderType
String getStringOrderType()

handleInvalid
Param for how to handle invalid data (unseen labels or NULL values). Options are 'skip' (filter out rows with invalid data), 'error' (throw an error), or 'keep' (put invalid data in a special additional bucket, at index numLabels). Default: "error".
Specified by:
    handleInvalid in interface HasHandleInvalid
Returns:
    (undocumented)
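
The three handleInvalid options can be illustrated with a small plain-Java sketch. It does not use Spark and is not Spark's implementation; it only emulates the documented behavior on an in-memory list. The class name HandleInvalidSketch, the method index, and the use of integer indices are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spark): emulates the documented
// 'skip' / 'error' / 'keep' semantics on a plain list of values.
class HandleInvalidSketch {

    // labels: the known labels, where a label's index is its position in the list.
    // values: the data to index; may contain nulls or labels not in 'labels'.
    static List<Integer> index(List<String> labels, List<String> values, String handleInvalid) {
        List<Integer> out = new ArrayList<>();
        for (String v : values) {
            // A null value or an unseen label counts as invalid data.
            int idx = (v == null) ? -1 : labels.indexOf(v);
            if (idx >= 0) {
                out.add(idx);
                continue;
            }
            switch (handleInvalid) {
                case "skip":
                    // Filter out the row: emit nothing for this value.
                    break;
                case "keep":
                    // Put invalid data in an extra bucket at index numLabels.
                    out.add(labels.size());
                    break;
                case "error":
                    throw new IllegalArgumentException("Unseen label or null value: " + v);
                default:
                    throw new IllegalArgumentException("Unknown handleInvalid option: " + handleInvalid);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("a", "b", "c");
        List<String> values = Arrays.asList("b", null, "z", "a");
        System.out.println(index(labels, values, "skip")); // prints [1, 0]
        System.out.println(index(labels, values, "keep")); // prints [1, 3, 3, 0]
    }
}
```

With three known labels, 'keep' maps both the null and the unseen "z" to index 3 (numLabels), while 'skip' drops those rows and 'error' fails on the first invalid value.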
 
stringOrderType
Param for how to order labels of string column. The first label after ordering is assigned an index of 0. Options are:
- 'frequencyDesc': descending order by label frequency (most frequent label assigned 0)
- 'frequencyAsc': ascending order by label frequency (least frequent label assigned 0)
- 'alphabetDesc': descending alphabetical order
- 'alphabetAsc': ascending alphabetical order
Default is 'frequencyDesc'.
Note: In case of equal frequency when under frequencyDesc/Asc, the strings are further sorted alphabetically.
Returns:
    (undocumented)
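
The ordering rules above, including the tie-break on equal frequency, can be sketched in plain Java. This is an emulation on an in-memory list, not Spark's implementation. The class name LabelOrderSketch and the method orderLabels are hypothetical, and the sketch assumes that "sorted alphabetically" in the tie-break means ascending order under Java's natural String ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spark): orders distinct labels according
// to the documented stringOrderType options.
class LabelOrderSketch {

    // Returns the distinct labels in order; a label's index is its position.
    static List<String> orderLabels(List<String> values, String orderType) {
        // Count how often each label occurs.
        Map<String, Long> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        List<String> labels = new ArrayList<>(counts.keySet());

        Comparator<String> alphabetAsc = Comparator.naturalOrder();
        Comparator<String> frequencyAsc = Comparator.comparingLong(label -> counts.get(label));

        switch (orderType) {
            case "frequencyDesc":
                // Most frequent first; ties broken alphabetically (assumed ascending).
                labels.sort(frequencyAsc.reversed().thenComparing(alphabetAsc));
                break;
            case "frequencyAsc":
                // Least frequent first; ties broken alphabetically (assumed ascending).
                labels.sort(frequencyAsc.thenComparing(alphabetAsc));
                break;
            case "alphabetDesc":
                labels.sort(alphabetAsc.reversed());
                break;
            case "alphabetAsc":
                labels.sort(alphabetAsc);
                break;
            default:
                throw new IllegalArgumentException("Unknown stringOrderType: " + orderType);
        }
        return labels;
    }

    public static void main(String[] args) {
        // "a" and "b" occur twice, "c" and "d" once.
        List<String> values = Arrays.asList("b", "a", "b", "c", "a", "d");
        System.out.println(orderLabels(values, "frequencyDesc")); // prints [a, b, c, d]
        System.out.println(orderLabels(values, "frequencyAsc"));  // prints [c, d, a, b]
    }
}
```

Under 'frequencyDesc', "a" and "b" share the highest count, so the tie-break places "a" at index 0 and "b" at index 1.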
 
validateAndTransformField
StructField validateAndTransformField(StructType schema, String inputColName, DataType inputDataType, String outputColName)

validateAndTransformSchema
Validates and transforms the input schema.
 