Package org.apache.spark.ml.feature
Class ChiSqSelectorModel
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.Model<T>
org.apache.spark.ml.feature.ChiSqSelectorModel
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, SelectorParams, Params, HasFeaturesCol, HasLabelCol, HasOutputCol, Identifiable, MLWritable
Model fitted by ChiSqSelector.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
Modifier and Type | Method | Description
static scala.Tuple2<int[],double[]> | compressSparse(int[] indices, double[] values, int[] selectedFeatures) |
 | copy(ParamMap extra) | Creates a copy of this instance with the same UID and some extra params.
final DoubleParam | fdr() | The upper bound of the expected false discovery rate.
 | featuresCol() | Param for features column name.
final DoubleParam | fpr() | The highest p-value for features to be kept.
final DoubleParam | fwe() | The upper bound of the expected family-wise error rate.
 | labelCol() | Param for label column name.
static ChiSqSelectorModel | load(String path) |
final IntParam | numTopFeatures() | Number of features that selector will select, ordered by ascending p-value.
 | outputCol() | Param for output column name.
final DoubleParam | percentile() | Percentile of features that selector will select, ordered by ascending p-value.
static StructField | prepOutputField(StructType schema, int[] selectedFeatures, String outputCol, String featuresCol, boolean isNumericAttribute) | Prepare the output column field, including per-feature metadata.
static MLReader<ChiSqSelectorModel> | read() |
int[] | selectedFeatures() |
 | selectorType() | The selector type.
 | setFeaturesCol(String value) |
 | setOutputCol(String value) |
 | toString() |
 | transform(Dataset<?> dataset) | Transforms the input dataset.
 | transformSchema(StructType schema) | Check transform validity and derive the output schema from the input schema.
 | uid() | An immutable unique ID for the object and its derivatives.
 | write() | Returns an MLWriter instance for this ML instance.
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
getOutputCol
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface org.apache.spark.ml.feature.SelectorParams
getFdr, getFpr, getFwe, getNumTopFeatures, getPercentile, getSelectorType
-
Method Details
-
read
-
load
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
- (undocumented)
-
selectedFeatures
public int[] selectedFeatures()
-
setFeaturesCol
-
setOutputCol
-
transformSchema
Description copied from class: PipelineStage
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks that do not depend on other parameters are handled by Param.validate().
A typical implementation should first verify the schema change and parameter validity, including complex parameter interaction checks.
- Parameters:
schema - (undocumented)
- Returns:
- (undocumented)
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
copy in interface Params
- Specified by:
copy in class Model<ChiSqSelectorModel>
- Parameters:
extra - (undocumented)
- Returns:
- (undocumented)
-
write
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
- Returns:
- (undocumented)
-
toString
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-
prepOutputField
public static StructField prepOutputField(StructType schema, int[] selectedFeatures, String outputCol, String featuresCol, boolean isNumericAttribute)
Prepare the output column field, including per-feature metadata.
- Parameters:
schema - (undocumented)
selectedFeatures - (undocumented)
outputCol - (undocumented)
featuresCol - (undocumented)
isNumericAttribute - (undocumented)
- Returns:
- (undocumented)
-
compressSparse
public static scala.Tuple2<int[],double[]> compressSparse(int[] indices, double[] values, int[] selectedFeatures) -
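The signature above carries no description. As a rough illustration only, the following sketch shows one plausible reading: keep the entries of a sparse vector whose index is among the selected features, and renumber each kept index to its position within the selected set. The class CompressSparseSketch, the holder type Compressed, and the requirement that both index arrays are sorted ascending are assumptions made for this example, not part of the documented API.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Spark) sketching a sparse-vector projection. */
final class CompressSparseSketch {

    /** Simple holder standing in for scala.Tuple2<int[], double[]>. */
    static final class Compressed {
        final int[] indices;
        final double[] values;

        Compressed(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    /**
     * Assumption: indices and selectedFeatures are both sorted ascending.
     * Walks both arrays in step and keeps only the entries present in both.
     */
    static Compressed compress(int[] indices, double[] values, int[] selectedFeatures) {
        int capacity = Math.min(indices.length, selectedFeatures.length);
        int[] newIndices = new int[capacity];
        double[] newValues = new double[capacity];
        int i = 0; // cursor into indices
        int j = 0; // cursor into selectedFeatures
        int k = 0; // number of kept entries
        while (i < indices.length && j < selectedFeatures.length) {
            if (indices[i] == selectedFeatures[j]) {
                newIndices[k] = j;         // position within the selected set
                newValues[k] = values[i];
                k++;
                i++;
                j++;
            } else if (indices[i] < selectedFeatures[j]) {
                i++;                       // entry not selected, skip it
            } else {
                j++;                       // selected feature is zero in this vector
            }
        }
        return new Compressed(Arrays.copyOf(newIndices, k), Arrays.copyOf(newValues, k));
    }

    public static void main(String[] args) {
        Compressed c = compress(new int[]{0, 3, 5}, new double[]{1.0, 2.0, 3.0}, new int[]{3, 4, 5});
        // prints "[0, 2] [2.0, 3.0]"
        System.out.println(Arrays.toString(c.indices) + " " + Arrays.toString(c.values));
    }
}
```

In this sketch the output vector has one slot per selected feature, so original index 5 becomes index 2 because 5 is the third selected feature.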
numTopFeatures
Description copied from interface: SelectorParams
Number of features that selector will select, ordered by ascending p-value. If the number of features is less than numTopFeatures, then this will select all features. Only applicable when selectorType = "numTopFeatures". The default value of numTopFeatures is 50.
- Specified by:
numTopFeatures in interface SelectorParams
- Returns:
- (undocumented)
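To make the rule concrete, here is a minimal sketch of a top-k selection by ascending p-value, written against a plain array of p-values rather than Spark's test results. The class TopKByPValueSketch and its method are hypothetical names for illustration, and returning the chosen indices in ascending order is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

/** Hypothetical helper (not part of Spark) sketching the "numTopFeatures" rule. */
final class TopKByPValueSketch {

    /** Returns the indices of the numTopFeatures smallest p-values, sorted ascending. */
    static int[] selectTopK(double[] pValues, int numTopFeatures) {
        return IntStream.range(0, pValues.length)
                .boxed()
                .sorted(Comparator.comparingDouble(i -> pValues[i])) // ascending p-value
                .limit(numTopFeatures)  // keeps everything when fewer features exist
                .mapToInt(Integer::intValue)
                .sorted()               // assumption: report indices in ascending order
                .toArray();
    }

    public static void main(String[] args) {
        double[] pValues = {0.20, 0.01, 0.50, 0.03};
        System.out.println(Arrays.toString(selectTopK(pValues, 2)));  // prints "[1, 3]"
        System.out.println(Arrays.toString(selectTopK(pValues, 50))); // prints "[0, 1, 2, 3]"
    }
}
```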
-
percentile
Description copied from interface: SelectorParams
Percentile of features that selector will select, ordered by ascending p-value. Only applicable when selectorType = "percentile". Default value is 0.1.
- Specified by:
percentile in interface SelectorParams
- Returns:
- (undocumented)
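The description does not say how a fractional feature count is rounded. The sketch below assumes truncation toward zero when converting the percentile into a number of features; the class PercentileCountSketch is a hypothetical name, and the rounding choice should be checked against the Spark version in use.

```java
/** Hypothetical helper (not part of Spark) converting a percentile into a feature count. */
final class PercentileCountSketch {

    /** Assumption: the fractional count is truncated toward zero. */
    static int featureCount(int numFeatures, double percentile) {
        if (percentile < 0.0 || percentile > 1.0) {
            throw new IllegalArgumentException("percentile must be in [0, 1]");
        }
        return (int) (numFeatures * percentile);
    }

    public static void main(String[] args) {
        System.out.println(featureCount(4, 0.5)); // prints "2"
        System.out.println(featureCount(4, 0.1)); // prints "0"
    }
}
```

Under this assumption the default of 0.1 selects no features at all when fewer than ten are available, which is worth keeping in mind for narrow feature vectors.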
-
fpr
Description copied from interface: SelectorParams
The highest p-value for features to be kept. Only applicable when selectorType = "fpr". Default value is 0.05.
- Specified by:
fpr in interface SelectorParams
- Returns:
- (undocumented)
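A threshold rule of this kind can be sketched as a simple filter over p-values. The class FprSketch is a hypothetical name, and the strict comparison (p-value below fpr rather than at most fpr) is an assumption, since the description does not state how ties with the threshold are handled.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical helper (not part of Spark) sketching the "fpr" rule. */
final class FprSketch {

    /** Keeps every feature whose p-value is strictly below fpr (assumed comparison). */
    static int[] selectByFpr(double[] pValues, double fpr) {
        return IntStream.range(0, pValues.length)
                .filter(i -> pValues[i] < fpr)
                .toArray();
    }

    public static void main(String[] args) {
        double[] pValues = {0.20, 0.01, 0.50, 0.03};
        System.out.println(Arrays.toString(selectByFpr(pValues, 0.05))); // prints "[1, 3]"
    }
}
```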
-
fdr
Description copied from interface: SelectorParams
The upper bound of the expected false discovery rate. Only applicable when selectorType = "fdr". Default value is 0.05.
- Specified by:
fdr in interface SelectorParams
- Returns:
- (undocumented)
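The description names the quantity being bounded but not the procedure. A common way to bound the expected false discovery rate is the Benjamini-Hochberg step-up procedure; the sketch below assumes that procedure and is not confirmed by this page. The class FdrSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical helper (not part of Spark) sketching an "fdr" rule via Benjamini-Hochberg. */
final class FdrSketch {

    /**
     * Sorts features by ascending p-value, finds the largest rank k (1-based)
     * with p(k) <= fdr * k / m, and keeps the k features with the smallest p-values.
     */
    static int[] selectByFdr(double[] pValues, double fdr) {
        int m = pValues.length;
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble(i -> pValues[i]));

        int cutoff = 0; // number of features to keep
        for (int rank = 1; rank <= m; rank++) {
            if (pValues[order[rank - 1]] <= fdr * rank / m) {
                cutoff = rank; // step-up: a later rank can rescue earlier ones
            }
        }

        int[] selected = new int[cutoff];
        for (int r = 0; r < cutoff; r++) {
            selected[r] = order[r];
        }
        Arrays.sort(selected); // assumption: report indices in ascending order
        return selected;
    }

    public static void main(String[] args) {
        double[] pValues = {0.03, 0.03, 0.035, 0.50};
        System.out.println(Arrays.toString(selectByFdr(pValues, 0.05))); // prints "[0, 1, 2]"
    }
}
```

In the usage example the first two p-values exceed their own rank thresholds (0.0125 and 0.025), yet they are kept because the third satisfies its threshold of 0.0375. This is the step-up behavior that distinguishes the procedure from a fixed cutoff.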
-
fwe
Description copied from interface: SelectorParams
The upper bound of the expected family-wise error rate. Only applicable when selectorType = "fwe". Default value is 0.05.
- Specified by:
fwe in interface SelectorParams
- Returns:
- (undocumented)
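One standard way to bound the family-wise error rate is a Bonferroni correction, which compares each p-value against the bound divided by the number of features. The sketch below assumes that correction with a strict comparison; neither detail is stated on this page, and the class FweSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Hypothetical helper (not part of Spark) sketching an "fwe" rule via Bonferroni. */
final class FweSketch {

    /** Keeps features whose p-value is below fwe divided by the number of features. */
    static int[] selectByFwe(double[] pValues, double fwe) {
        double threshold = fwe / pValues.length; // Bonferroni-adjusted cutoff (assumed)
        return IntStream.range(0, pValues.length)
                .filter(i -> pValues[i] < threshold)
                .toArray();
    }

    public static void main(String[] args) {
        double[] pValues = {0.20, 0.01, 0.50, 0.03};
        // threshold is 0.05 / 4 = 0.0125, so only index 1 survives
        System.out.println(Arrays.toString(selectByFwe(pValues, 0.05))); // prints "[1]"
    }
}
```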
-
selectorType
Description copied from interface: SelectorParams
The selector type. Supported options: "numTopFeatures" (default), "percentile", "fpr", "fdr", "fwe".
- Specified by:
selectorType in interface SelectorParams
- Returns:
- (undocumented)
-
outputCol
Description copied from interface: HasOutputCol
Param for output column name.
- Specified by:
outputCol in interface HasOutputCol
- Returns:
- (undocumented)
-
labelCol
Description copied from interface: HasLabelCol
Param for label column name.
- Specified by:
labelCol in interface HasLabelCol
- Returns:
- (undocumented)
-
featuresCol
Description copied from interface: HasFeaturesCol
Param for features column name.
- Specified by:
featuresCol in interface HasFeaturesCol
- Returns:
- (undocumented)
-
transform
Description copied from class: Transformer
Transforms the input dataset.
- Specified by:
transform in class Transformer
- Parameters:
dataset - (undocumented)
- Returns:
- (undocumented)
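The page does not describe what the transformation does to each row. For a feature selector, a reasonable reading is that each feature vector is reduced to the entries listed in selectedFeatures. The sketch below shows that projection for a dense vector held in a plain array; the class DenseProjectionSketch is a hypothetical name, and the real method operates on a Dataset rather than on arrays.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Spark) sketching the per-row projection. */
final class DenseProjectionSketch {

    /** Copies the selected positions of a dense feature vector, in the given order. */
    static double[] project(double[] features, int[] selectedFeatures) {
        double[] out = new double[selectedFeatures.length];
        for (int k = 0; k < selectedFeatures.length; k++) {
            out[k] = features[selectedFeatures[k]];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] row = {1.0, 2.0, 3.0, 4.0};
        System.out.println(Arrays.toString(project(row, new int[]{1, 3}))); // prints "[2.0, 4.0]"
    }
}
```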
-