Package org.apache.spark.ml
Class Estimator<M extends Model<M>>
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Estimator<M>
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging, Params, Identifiable
- Direct Known Subclasses:
- ALS, BisectingKMeans, BucketedRandomProjectionLSH, ChiSqSelector, CountVectorizer, CrossValidator, FPGrowth, GaussianMixture, IDF, Imputer, IsotonicRegression, KMeans, LDA, MaxAbsScaler, MinHashLSH, MinMaxScaler, OneHotEncoder, OneVsRest, PCA, Pipeline, Predictor, QuantileDiscretizer, RFormula, RobustScaler, StandardScaler, StringIndexer, TargetEncoder, TrainValidationSplit, UnivariateFeatureSelector, VarianceThresholdSelector, VectorIndexer, Word2Vec
Abstract class for estimators that fit models to data.
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Constructor Summary
Constructors
- Estimator()
- 
Method Summary
Modifier and Type, Method, Description
- abstract Estimator<M> copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params.
- abstract M fit(Dataset<?> dataset): Fits a model to the input data.
- M fit(Dataset<?> dataset, ParamMap paramMap): Fits a single model to the input data with provided parameter map.
- M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs): Fits a single model to the input data with optional parameters.
- M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, scala.collection.immutable.Seq<ParamPair<?>> otherParamPairs): Fits a single model to the input data with optional parameters.
- scala.collection.immutable.Seq<M> fit(Dataset<?> dataset, scala.collection.immutable.Seq<ParamMap> paramMaps): Fits multiple models to the input data with multiple sets of parameters.
Methods inherited from class org.apache.spark.ml.PipelineStage
params, transformSchema
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString, uid
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, set, set, set, setDefault, setDefault, shouldOwn
- 
Constructor Details
Estimator
public Estimator()
 
- 
- 
Method Details
copy
public abstract Estimator<M> copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
- copy in interface Params
- Specified by:
- copy in class PipelineStage
- Parameters:
- extra - extra params to set on the copy
- Returns:
- a copy of this instance with the same UID and the extra params applied
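The copy contract described above (same UID, extra params layered on top) can be pictured with a small Spark-free model. This is only a sketch: `ToyStage`, its `uid` field, and its string-keyed map are hypothetical names invented for illustration and are not Spark classes. It also assumes that the original instance is left unchanged by the copy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a param-carrying stage; NOT a Spark class.
final class ToyStage {
    final String uid;
    private final Map<String, Object> params;

    ToyStage(String uid, Map<String, Object> params) {
        this.uid = uid;
        // Defensive copy so each instance owns its own param values.
        this.params = new HashMap<>(params);
    }

    // Returns a new instance with the same UID: current params first,
    // then the extra params layered on top (extras win on conflict).
    ToyStage copy(Map<String, Object> extra) {
        Map<String, Object> merged = new HashMap<>(params);
        merged.putAll(extra);
        return new ToyStage(uid, merged);
    }

    // Looks up a param value by name; null if it was never set.
    Object get(String name) {
        return params.get(name);
    }
}
```

In this model, calling `copy` with an extra value for a param that is already set yields a new object that reports the extra value, while the original keeps its own.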
 
- 
fit
public M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs)
Fits a single model to the input data with optional parameters.
- Parameters:
- dataset - input dataset
- firstParamPair - the first param pair. It overrides the embedded params.
- otherParamPairs - other param pairs. These values override any specified in this Estimator's embedded ParamMap.
- Returns:
- fitted model
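The override rule for param pairs can be sketched without Spark. In the model below, `ToyPair` and `PairOverrideSketch` are hypothetical names made up for this example (they are not Spark's `ParamPair` or `Estimator`), and params are keyed by plain strings for brevity. The sketch assumes the pairs are applied on top of the embedded values only for this one call, leaving the embedded map itself untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical name/value pair standing in for a param pair; NOT a Spark class.
final class ToyPair {
    final String name;
    final Object value;

    ToyPair(String name, Object value) {
        this.name = name;
        this.value = value;
    }
}

final class PairOverrideSketch {
    // Computes the params a single fit call would effectively use:
    // embedded values first, then firstParamPair, then otherParamPairs.
    static Map<String, Object> effectiveParams(Map<String, Object> embedded,
                                               ToyPair firstParamPair,
                                               ToyPair... otherParamPairs) {
        Map<String, Object> merged = new HashMap<>(embedded);
        merged.put(firstParamPair.name, firstParamPair.value);
        for (ToyPair pair : otherParamPairs) {
            merged.put(pair.name, pair.value);
        }
        return merged;
    }
}
```

Params that are not mentioned in any pair keep their embedded values, which is why a call site only needs to list what it wants to change.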
 
- 
fit
public M fit(Dataset<?> dataset, ParamPair<?> firstParamPair, scala.collection.immutable.Seq<ParamPair<?>> otherParamPairs)
Fits a single model to the input data with optional parameters.
- Parameters:
- dataset - input dataset
- firstParamPair - the first param pair. It overrides the embedded params.
- otherParamPairs - other param pairs. These values override any specified in this Estimator's embedded ParamMap.
- Returns:
- fitted model
 
- 
fit
public M fit(Dataset<?> dataset, ParamMap paramMap)
Fits a single model to the input data with the provided parameter map.
- Parameters:
- dataset - input dataset
- paramMap - parameter map. These values override any specified in this Estimator's embedded ParamMap.
- Returns:
- fitted model
 
- 
fit
public abstract M fit(Dataset<?> dataset)
Fits a model to the input data.
- Parameters:
- dataset - input dataset
- Returns:
- fitted model
 
- 
fit
public scala.collection.immutable.Seq<M> fit(Dataset<?> dataset, scala.collection.immutable.Seq<ParamMap> paramMaps)
Fits multiple models to the input data with multiple sets of parameters. The default implementation fits one model per parameter map in a simple loop. Subclasses can override this method to optimize multi-model training.
- Parameters:
- dataset - input dataset
- paramMaps - a sequence of parameter maps. These values override any specified in this Estimator's embedded ParamMap.
- Returns:
- fitted models, in the same order as the input parameter maps
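The default multi-model behavior described above, one fit per parameter map with results in input order, can be modeled in a few lines of plain Java. This is a simplified sketch rather than Spark's implementation: `MultiFitSketch` and `fitAll` are hypothetical names, the single-map fit is passed in as a `Function`, and parameter maps are represented as string-keyed maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper modeling the default multi-model fit; NOT a Spark class.
final class MultiFitSketch {
    // Fits one model per parameter map by calling the single-map fit
    // in a loop, so result i corresponds to paramMaps.get(i).
    static <M> List<M> fitAll(Function<Map<String, Object>, M> fitOne,
                              List<Map<String, Object>> paramMaps) {
        List<M> models = new ArrayList<>(paramMaps.size());
        for (Map<String, Object> paramMap : paramMaps) {
            models.add(fitOne.apply(paramMap));
        }
        return models;
    }
}
```

A loop like this repeats any work that does not depend on the parameters, which is the kind of cost a subclass override could avoid, for example by sharing a preprocessing pass across all parameter maps.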
 
 
-