Package org.apache.spark.ml.feature
Class ChiSqSelectorModel
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.Model<T>
org.apache.spark.ml.feature.ChiSqSelectorModel
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging, SelectorParams, Params, HasFeaturesCol, HasLabelCol, HasOutputCol, Identifiable, MLWritable
Model fitted by ChiSqSelector.
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
- static class
- static class
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
- org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Method Summary
Modifier and Type, Method, Description
- static scala.Tuple2<int[],double[]> compressSparse(int[] indices, double[] values, int[] selectedFeatures)
- copy: Creates a copy of this instance with the same UID and some extra params.
- final DoubleParam fdr(): The upper bound of the expected false discovery rate.
- featuresCol(): Param for features column name.
- final DoubleParam fpr(): The highest p-value for features to be kept.
- final DoubleParam fwe(): The upper bound of the expected family-wise error rate.
- labelCol(): Param for label column name.
- static ChiSqSelectorModel load
- final IntParam numTopFeatures(): Number of features that selector will select, ordered by ascending p-value.
- outputCol(): Param for output column name.
- final DoubleParam percentile(): Percentile of features that selector will select, ordered by ascending p-value.
- static StructField prepOutputField(StructType schema, int[] selectedFeatures, String outputCol, String featuresCol, boolean isNumericAttribute): Prepare the output column field, including per-feature metadata.
- static MLReader<ChiSqSelectorModel> read()
- int[] selectedFeatures()
- selectorType(): The selector type.
- setFeaturesCol(String value)
- setOutputCol(String value)
- toString()
- transform: Transforms the input dataset.
- transformSchema(StructType schema): Check transform validity and derive the output schema from the input schema.
- uid(): An immutable unique ID for the object and its derivatives.
- write(): Returns an MLWriter instance for this ML instance.
Methods inherited from class org.apache.spark.ml.Transformer
- transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
- params
Methods inherited from class java.lang.Object
- equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
- getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
- getLabelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasOutputCol
- getOutputCol
Methods inherited from interface org.apache.spark.internal.Logging
- initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
- save
Methods inherited from interface org.apache.spark.ml.param.Params
- clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface org.apache.spark.ml.feature.SelectorParams
- getFdr, getFpr, getFwe, getNumTopFeatures, getPercentile, getSelectorType
- 
Method Details
read
- 
load
- 
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
- uid in interface Identifiable
- Returns:
- (undocumented)
 
- 
selectedFeatures
public int[] selectedFeatures()
- 
setFeaturesCol
- 
setOutputCol
- 
transformSchema
Description copied from class: PipelineStage
Check transform validity and derive the output schema from the input schema.
We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().
Typical implementation should first conduct verification on schema change and parameter validity, including complex parameter interaction checks.
- Parameters:
- schema - (undocumented)
- Returns:
- (undocumented)
 
- 
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
- copy in interface Params
- Specified by:
- copy in class Model<ChiSqSelectorModel>
- Parameters:
- extra - (undocumented)
- Returns:
- (undocumented)
 
- 
write
Description copied from interface: MLWritable
Returns an MLWriter instance for this ML instance.
- Returns:
- (undocumented)
 
- 
toString
- Specified by:
- toString in interface Identifiable
- Overrides:
- toString in class Object
 
- 
prepOutputField
public static StructField prepOutputField(StructType schema, int[] selectedFeatures, String outputCol, String featuresCol, boolean isNumericAttribute)
Prepare the output column field, including per-feature metadata.
- Parameters:
- schema - (undocumented)
- selectedFeatures - (undocumented)
- outputCol - (undocumented)
- featuresCol - (undocumented)
- isNumericAttribute - (undocumented)
- Returns:
- (undocumented)
 
- 
compressSparse
public static scala.Tuple2<int[],double[]> compressSparse(int[] indices, double[] values, int[] selectedFeatures)
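This page gives no description for compressSparse. The following is a minimal sketch of one plausible reading, not Spark's source. It assumes that indices and selectedFeatures are both sorted ascending, and that each kept entry is renumbered by its position within selectedFeatures. The class CompressSparseSketch and the Compressed holder (standing in for scala.Tuple2) are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

/** Hypothetical illustration of sparse-vector compression; not the Spark implementation. */
final class CompressSparseSketch {

    /** Holds the compressed (indices, values) pair; stands in for scala.Tuple2. */
    static final class Compressed {
        final int[] indices;
        final double[] values;

        Compressed(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    /**
     * Keeps only the entries whose index appears in selectedFeatures and
     * renumbers each kept entry by its position within selectedFeatures.
     * Assumption: indices and selectedFeatures are both sorted ascending.
     */
    static Compressed compressSparse(int[] indices, double[] values, int[] selectedFeatures) {
        int capacity = Math.min(indices.length, selectedFeatures.length);
        int[] newIndices = new int[capacity];
        double[] newValues = new double[capacity];
        int i = 0; // cursor into indices/values
        int j = 0; // cursor into selectedFeatures
        int n = 0; // number of entries kept so far
        while (i < indices.length && j < selectedFeatures.length) {
            if (indices[i] == selectedFeatures[j]) {
                newIndices[n] = j;        // new index is the position among selected features
                newValues[n] = values[i];
                n++;
                i++;
                j++;
            } else if (indices[i] < selectedFeatures[j]) {
                i++;                      // this non-zero entry was not selected
            } else {
                j++;                      // this selected feature is zero in the input
            }
        }
        return new Compressed(Arrays.copyOf(newIndices, n), Arrays.copyOf(newValues, n));
    }

    public static void main(String[] args) {
        Compressed c = compressSparse(
                new int[] {1, 3, 4}, new double[] {10.0, 30.0, 40.0}, new int[] {0, 3, 4});
        System.out.println(Arrays.toString(c.indices) + " " + Arrays.toString(c.values));
    }
}
```

The two-cursor walk touches each array once, so the cost is linear in the number of non-zero entries plus the number of selected features.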
- 
numTopFeatures
Description copied from interface: SelectorParams
Number of features that selector will select, ordered by ascending p-value. If the number of features is less than numTopFeatures, then this will select all features. Only applicable when selectorType = "numTopFeatures". The default value of numTopFeatures is 50.
- Specified by:
- numTopFeatures in interface SelectorParams
- Returns:
- (undocumented)
 
- 
percentile
Description copied from interface: SelectorParams
Percentile of features that selector will select, ordered by ascending p-value. Only applicable when selectorType = "percentile". Default value is 0.1.
- Specified by:
- percentile in interface SelectorParams
- Returns:
- (undocumented)
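The numTopFeatures and percentile rules described above both rank features by ascending p-value and keep a prefix of that ranking. The following is a minimal sketch in plain Java, not Spark's implementation. The class TopFeaturesSketch and its methods are hypothetical. Truncating numFeatures * percentile to an integer and returning the selected indices in ascending order are both assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration of rank-based selection; not the Spark implementation. */
final class TopFeaturesSketch {

    /** Returns the indices of the k smallest p-values, sorted ascending by index. */
    static int[] selectTop(double[] pValues, int k) {
        Integer[] order = new Integer[pValues.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Stable sort by ascending p-value; ties keep their original index order.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> pValues[i]));
        // If there are fewer features than k, keep all of them.
        int keep = Math.max(0, Math.min(k, pValues.length));
        int[] selected = new int[keep];
        for (int i = 0; i < keep; i++) {
            selected[i] = order[i];
        }
        Arrays.sort(selected);
        return selected;
    }

    /** selectorType = "numTopFeatures": keep a fixed number of features. */
    static int[] selectByNumTopFeatures(double[] pValues, int numTopFeatures) {
        return selectTop(pValues, numTopFeatures);
    }

    /** selectorType = "percentile": keep a fraction of all features (truncation assumed). */
    static int[] selectByPercentile(double[] pValues, double percentile) {
        return selectTop(pValues, (int) (pValues.length * percentile));
    }

    public static void main(String[] args) {
        double[] pValues = {0.30, 0.01, 0.20, 0.04};
        System.out.println(Arrays.toString(selectByNumTopFeatures(pValues, 2)));
        System.out.println(Arrays.toString(selectByPercentile(pValues, 0.5)));
    }
}
```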
 
- 
fpr
Description copied from interface: SelectorParams
The highest p-value for features to be kept. Only applicable when selectorType = "fpr". Default value is 0.05.
- Specified by:
- fpr in interface SelectorParams
- Returns:
- (undocumented)
 
- 
fdr
Description copied from interface: SelectorParams
The upper bound of the expected false discovery rate. Only applicable when selectorType = "fdr". Default value is 0.05.
- Specified by:
- fdr in interface SelectorParams
- Returns:
- (undocumented)
 
- 
fwe
Description copied from interface: SelectorParams
The upper bound of the expected family-wise error rate. Only applicable when selectorType = "fwe". Default value is 0.05.
- Specified by:
- fwe in interface SelectorParams
- Returns:
- (undocumented)
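The fpr, fdr and fwe parameters above are thresholds on per-feature p-values, but this page does not name the procedures behind them. The sketch below assumes a plain p-value cutoff for fpr, the Benjamini-Hochberg step-up procedure for fdr, and a Bonferroni-style cutoff (fwe divided by the number of features) for fwe. These are common choices, stated here as assumptions rather than as Spark's confirmed behavior. The class ErrorRateSelectionSketch and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.stream.IntStream;

/** Hypothetical illustration of threshold-based selection; not the Spark implementation. */
final class ErrorRateSelectionSketch {

    /** fpr: keep every feature whose p-value is below the threshold. */
    static int[] selectByFpr(double[] pValues, double fpr) {
        return filterBelow(pValues, fpr);
    }

    /** fwe: Bonferroni-style cutoff, fwe / numFeatures (assumption). */
    static int[] selectByFwe(double[] pValues, double fwe) {
        return filterBelow(pValues, fwe / pValues.length);
    }

    /** fdr: Benjamini-Hochberg step-up procedure (assumption). */
    static int[] selectByFdr(double[] pValues, double fdr) {
        int m = pValues.length;
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> pValues[i]));
        int last = -1; // largest rank whose p-value passes its rank-scaled threshold
        for (int rank = 0; rank < m; rank++) {
            if (pValues[order[rank]] <= fdr * (rank + 1) / m) {
                last = rank;
            }
        }
        // Keep every feature ranked at or before the last passing rank.
        int[] selected = new int[last + 1];
        for (int r = 0; r <= last; r++) {
            selected[r] = order[r];
        }
        Arrays.sort(selected);
        return selected;
    }

    /** Returns the indices whose p-value is strictly below the threshold. */
    private static int[] filterBelow(double[] pValues, double threshold) {
        return IntStream.range(0, pValues.length)
                .filter(i -> pValues[i] < threshold)
                .toArray();
    }

    public static void main(String[] args) {
        double[] pValues = {0.001, 0.02, 0.04, 0.30};
        System.out.println(Arrays.toString(selectByFpr(pValues, 0.05)));
        System.out.println(Arrays.toString(selectByFdr(pValues, 0.05)));
        System.out.println(Arrays.toString(selectByFwe(pValues, 0.05)));
    }
}
```

With the same 0.05 threshold, the three rules become progressively stricter in this example: fpr keeps three features, fdr keeps two, and fwe keeps one.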
 
- 
selectorType
Description copied from interface: SelectorParams
The selector type. Supported options: "numTopFeatures" (default), "percentile", "fpr", "fdr", "fwe".
- Specified by:
- selectorType in interface SelectorParams
- Returns:
- (undocumented)
 
- 
outputCol
Description copied from interface: HasOutputCol
Param for output column name.
- Specified by:
- outputCol in interface HasOutputCol
- Returns:
- (undocumented)
 
- 
labelCol
Description copied from interface: HasLabelCol
Param for label column name.
- Specified by:
- labelCol in interface HasLabelCol
- Returns:
- (undocumented)
 
- 
featuresCol
Description copied from interface: HasFeaturesCol
Param for features column name.
- Specified by:
- featuresCol in interface HasFeaturesCol
- Returns:
- (undocumented)
 
- 
transform
Description copied from class: Transformer
Transforms the input dataset.
- Specified by:
- transform in class Transformer
- Parameters:
- dataset - (undocumented)
- Returns:
- (undocumented)
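This page does not spell out what transform does to each row. As a rough sketch of the per-row effect for a dense feature vector, the following assumes the output keeps only the entries at the indices reported by selectedFeatures(), in that order. The class DenseSelectionSketch and the method selectDense are hypothetical, and plain arrays stand in for Spark's vector types.

```java
import java.util.Arrays;

/** Hypothetical illustration of per-row feature selection; not the Spark implementation. */
final class DenseSelectionSketch {

    /** Projects a dense feature array onto the selected indices, preserving their order. */
    static double[] selectDense(double[] features, int[] selectedFeatures) {
        double[] out = new double[selectedFeatures.length];
        for (int i = 0; i < selectedFeatures.length; i++) {
            out[i] = features[selectedFeatures[i]];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] row = {1.0, 2.0, 3.0, 4.0};
        System.out.println(Arrays.toString(selectDense(row, new int[] {0, 3})));
    }
}
```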
 
 