org.apache.spark.ml
Class Estimator<M extends Model<M>>

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Estimator<M>
All Implemented Interfaces:
java.io.Serializable, Logging, Params
Direct Known Subclasses:
ALS, CrossValidator, IDF, OneVsRest, Pipeline, Predictor, StandardScaler, StringIndexer, VectorIndexer, Word2Vec

public abstract class Estimator<M extends Model<M>>
extends PipelineStage

:: DeveloperApi :: Abstract class for estimators that fit models to data.

See Also:
Serialized Form

Constructor Summary
Estimator()
           
 
Method Summary
abstract  Estimator<M> copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
abstract  M fit(DataFrame dataset)
          Fits a model to the input data.
 M fit(DataFrame dataset, ParamMap paramMap)
          Fits a single model to the input data with provided parameter map.
 scala.collection.Seq<M> fit(DataFrame dataset, ParamMap[] paramMaps)
          Fits multiple models to the input data with multiple sets of parameters.
 M fit(DataFrame dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs)
          Fits a single model to the input data with optional parameters.
 M fit(DataFrame dataset, ParamPair<?> firstParamPair, scala.collection.Seq<ParamPair<?>> otherParamPairs)
          Fits a single model to the input data with optional parameters.
 
Methods inherited from class org.apache.spark.ml.PipelineStage
transformSchema
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

Estimator

public Estimator()
Method Detail

fit

public M fit(DataFrame dataset,
             ParamPair<?> firstParamPair,
             ParamPair<?>... otherParamPairs)
Fits a single model to the input data with optional parameters.

Parameters:
dataset - input dataset
firstParamPair - the first param pair, overrides embedded params
otherParamPairs - other param pairs. These values override any specified in this Estimator's embedded ParamMap.
Returns:
fitted model

fit

public M fit(DataFrame dataset,
             ParamPair<?> firstParamPair,
             scala.collection.Seq<ParamPair<?>> otherParamPairs)
Fits a single model to the input data with optional parameters.

Parameters:
dataset - input dataset
firstParamPair - the first param pair, overrides embedded params
otherParamPairs - other param pairs. These values override any specified in this Estimator's embedded ParamMap.
Returns:
fitted model

fit

public M fit(DataFrame dataset,
             ParamMap paramMap)
Fits a single model to the input data with provided parameter map.

Parameters:
dataset - input dataset
paramMap - Parameter map. These values override any specified in this Estimator's embedded ParamMap.
Returns:
fitted model
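
The override rule above can be illustrated with a minimal sketch in plain Java. ToyEstimator, effectiveParams and the string-keyed maps are hypothetical stand-ins, not Spark classes; the sketch assumes only what the parameter description states, namely that values passed at the call site take precedence over values set on the estimator, and it additionally assumes the estimator's own map is left untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not a Spark class: models only the param-override rule.
final class ToyEstimator {
    // Params set on the estimator itself (the "embedded" param map).
    private final Map<String, Object> embedded = new HashMap<>();

    ToyEstimator set(String name, Object value) {
        embedded.put(name, value);
        return this;
    }

    // Defensive copy so callers cannot mutate the embedded map.
    Map<String, Object> embeddedParams() {
        return new HashMap<>(embedded);
    }

    // Params a fit call would use: embedded values first, then the
    // call-site map, so call-site values win on conflict.
    Map<String, Object> effectiveParams(Map<String, Object> paramMap) {
        Map<String, Object> merged = new HashMap<>(embedded);
        merged.putAll(paramMap);
        return merged;
    }

    public static void main(String[] args) {
        ToyEstimator est = new ToyEstimator().set("maxIter", 10).set("regParam", 0.01);
        Map<String, Object> overrides = new HashMap<>();
        overrides.put("maxIter", 30);
        System.out.println(est.effectiveParams(overrides)); // maxIter is 30 here
        System.out.println(est.embeddedParams());           // maxIter is still 10 here
    }
}
```

In this sketch the merge happens on a copy, so the same estimator can be reused with different overrides without one call affecting the next.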

fit

public abstract M fit(DataFrame dataset)
Fits a model to the input data.

Parameters:
dataset - input dataset
Returns:
fitted model

fit

public scala.collection.Seq<M> fit(DataFrame dataset,
                                   ParamMap[] paramMaps)
Fits multiple models to the input data, one per parameter map. The default implementation loops over the parameter maps and fits each model in turn. Subclasses can override this method to optimize multi-model training.

Parameters:
dataset - input dataset
paramMaps - An array of parameter maps. These values override any specified in this Estimator's embedded ParamMap.
Returns:
fitted models, matching the input parameter maps
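
The default behavior described above (one fit per parameter map, results in input order) could look roughly like the following plain-Java sketch. LoopingEstimator, ScaledMeanEstimator and the "scale" parameter are hypothetical names invented for illustration; they are not Spark classes, and the "model" here is just a number.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical concrete estimator: the "model" is the dataset mean times "scale".
final class ScaledMeanEstimator extends LoopingEstimator<Double> {
    @Override
    Double fit(List<Double> dataset, Map<String, Object> paramMap) {
        double scale = (Double) paramMap.getOrDefault("scale", 1.0);
        double sum = 0.0;
        for (double x : dataset) {
            sum += x;
        }
        return scale * sum / dataset.size();
    }

    public static void main(String[] args) {
        Map<String, Object> half = new HashMap<>();
        half.put("scale", 0.5);
        Map<String, Object> twice = new HashMap<>();
        twice.put("scale", 2.0);
        List<Map<String, Object>> paramMaps = new ArrayList<>();
        paramMaps.add(half);
        paramMaps.add(twice);

        List<Double> data = Arrays.asList(1.0, 2.0, 3.0);
        List<Double> models = new ScaledMeanEstimator().fit(data, paramMaps);
        System.out.println(models); // prints [1.0, 4.0]
    }
}

// Hypothetical stand-in, not a Spark class.
abstract class LoopingEstimator<M> {
    // Fits a single model using the given parameter map.
    abstract M fit(List<Double> dataset, Map<String, Object> paramMap);

    // Default multi-model fit: one independent fit per parameter map,
    // with results returned in the same order as the input maps.
    List<M> fit(List<Double> dataset, List<Map<String, Object>> paramMaps) {
        List<M> models = new ArrayList<>();
        for (Map<String, Object> paramMap : paramMaps) {
            models.add(fit(dataset, paramMap));
        }
        return models;
    }
}
```

A subclass that can share work across parameter maps (for example, by computing a statistic once and reusing it) would override the list-taking overload instead of relying on the loop.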

copy

public abstract Estimator<M> copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and narrow the return type to the concrete class.

Specified by:
copy in interface Params
Specified by:
copy in class PipelineStage
Parameters:
extra - extra params to apply to the copy
Returns:
a copy of this instance with the same UID
See Also:
defaultCopy()
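
The copy contract can be sketched in plain Java as follows. ToyParams and ToyScaler are hypothetical stand-ins, not Spark classes. The sketch shows a subclass narrowing the return type (a covariant return), keeping the UID, and applying the extra params to the copy only; that extra values take precedence over existing ones is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical concrete stage, not a Spark class.
final class ToyScaler extends ToyParams {
    ToyScaler(String uid) {
        super(uid);
    }

    // Covariant return: callers get a ToyScaler back without casting.
    @Override
    ToyScaler copy(Map<String, Object> extra) {
        ToyScaler that = new ToyScaler(uid); // same UID as the original
        that.values.putAll(values);          // carry over existing params
        that.values.putAll(extra);           // extra params win (assumed)
        return that;
    }

    public static void main(String[] args) {
        ToyScaler original = new ToyScaler("scaler_01");
        original.values.put("withMean", false);

        Map<String, Object> extra = new HashMap<>();
        extra.put("withMean", true);
        ToyScaler copied = original.copy(extra);

        System.out.println(copied.uid.equals(original.uid)); // prints true
        System.out.println(original.values.get("withMean")); // prints false
        System.out.println(copied.values.get("withMean"));   // prints true
    }
}

// Hypothetical base type standing in for a params-carrying stage.
abstract class ToyParams {
    final String uid;
    final Map<String, Object> values = new HashMap<>();

    ToyParams(String uid) {
        this.uid = uid;
    }

    // Subclasses return their own type here.
    abstract ToyParams copy(Map<String, Object> extra);
}
```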