org.apache.spark.ml.feature
Class StandardScaler

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Estimator<StandardScalerModel>
          extended by org.apache.spark.ml.feature.StandardScaler
All Implemented Interfaces:
java.io.Serializable, Logging, Params

public class StandardScaler
extends Estimator<StandardScalerModel>

:: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
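The description above can be illustrated with a small plain-Java sketch of column-wise standardization. It does not use Spark and is not the library's implementation: the class and method names are hypothetical, the n - 1 (sample) denominator for the standard deviation is an assumption of this sketch, and so is the choice to map zero-variance columns to 0.0.

```java
import java.util.Arrays;

/** Plain-Java sketch of column-wise standardization; hypothetical names, not Spark code. */
final class StandardizeSketch {

    /** Column means of a row-major sample matrix. */
    static double[] columnMeans(double[][] rows) {
        int d = rows[0].length;
        double[] mean = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= rows.length;
        }
        return mean;
    }

    /** Column standard deviations with the n - 1 denominator (an assumption of this sketch). */
    static double[] columnStds(double[][] rows, double[] mean) {
        int d = mean.length;
        double[] std = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j] / (rows.length - 1));
        }
        return std;
    }

    /** Applies optional centering and scaling to one row; zero-variance columns map to 0.0. */
    static double[] scale(double[] row, double[] mean, double[] std,
                          boolean withMean, boolean withStd) {
        double[] out = new double[row.length];
        for (int j = 0; j < row.length; j++) {
            double v = withMean ? row[j] - mean[j] : row[j];
            if (withStd) {
                v = (std[j] == 0.0) ? 0.0 : v / std[j];
            }
            out[j] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = { {1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0} };
        double[] mean = columnMeans(train);
        double[] std = columnStds(train, mean);
        for (double[] row : train) {
            System.out.println(Arrays.toString(scale(row, mean, std, true, true)));
        }
    }
}
```

The statistics are computed once from the training rows and then reused for every row that is scaled, which is the sense in which the scaler depends on "the samples in the training set".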

See Also:
Serialized Form

Constructor Summary
StandardScaler()
           
StandardScaler(String uid)
           
 
Method Summary
 StandardScaler copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
 StandardScalerModel fit(DataFrame dataset)
          Fits a model to the input data.
 StandardScaler setInputCol(String value)
           
 StandardScaler setOutputCol(String value)
           
 StandardScaler setWithMean(boolean value)
           
 StandardScaler setWithStd(boolean value)
           
 StructType transformSchema(StructType schema)
          :: DeveloperApi ::
 String uid()
           
 BooleanParam withMean()
          Whether to center the data with the mean before scaling.
 BooleanParam withStd()
          Whether to scale the data to unit standard deviation.
 
Methods inherited from class org.apache.spark.ml.Estimator
fit, fit, fit, fit
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

StandardScaler

public StandardScaler(String uid)

StandardScaler

public StandardScaler()

Method Detail

uid

public String uid()

setInputCol

public StandardScaler setInputCol(String value)

setOutputCol

public StandardScaler setOutputCol(String value)

setWithMean

public StandardScaler setWithMean(boolean value)

setWithStd

public StandardScaler setWithStd(boolean value)

fit

public StandardScalerModel fit(DataFrame dataset)
Description copied from class: Estimator
Fits a model to the input data.

Specified by:
fit in class Estimator<StandardScalerModel>
Parameters:
dataset - input DataFrame containing the column named by inputCol
Returns:
the fitted StandardScalerModel
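The setter and fit signatures above suggest a fluent estimator that produces a separate fitted model. The following plain-Java sketch mimics that shape without Spark. ScalerSketch and ModelSketch are hypothetical stand-ins, the row-array input replaces the DataFrame, and the n - 1 denominator is an assumption of this sketch.

```java
/** Hypothetical stand-in for an estimator with fluent setters; not a Spark class. */
final class ScalerSketch {
    private boolean withMean = false; // mirrors the documented default
    private boolean withStd = true;   // mirrors the documented default

    ScalerSketch setWithMean(boolean value) { withMean = value; return this; }
    ScalerSketch setWithStd(boolean value) { withStd = value; return this; }

    /** Computes per-column statistics once and freezes them in a model. */
    ModelSketch fit(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] std = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) mean[j] += row[j];
        }
        for (int j = 0; j < d; j++) mean[j] /= n;
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) std[j] = Math.sqrt(std[j] / (n - 1));
        return new ModelSketch(mean, std, withMean, withStd);
    }

    /** Hypothetical fitted model: holds training statistics and applies them to any row. */
    static final class ModelSketch {
        private final double[] mean;
        private final double[] std;
        private final boolean withMean;
        private final boolean withStd;

        ModelSketch(double[] mean, double[] std, boolean withMean, boolean withStd) {
            this.mean = mean;
            this.std = std;
            this.withMean = withMean;
            this.withStd = withStd;
        }

        double[] transform(double[] row) {
            double[] out = new double[row.length];
            for (int j = 0; j < row.length; j++) {
                double v = withMean ? row[j] - mean[j] : row[j];
                if (withStd) v = (std[j] == 0.0) ? 0.0 : v / std[j];
                out[j] = v;
            }
            return out;
        }
    }
}
```

A call chain such as `new ScalerSketch().setWithMean(true).fit(rows).transform(row)` then reads much like the setter-then-fit usage the signatures imply; rows not seen during fitting are scaled with the training statistics.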

transformSchema

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
:: DeveloperApi ::

Derives the output schema from the input schema.

Specified by:
transformSchema in class PipelineStage
Parameters:
schema - the input schema
Returns:
the output schema derived from the input schema
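As a rough illustration of deriving an output schema from an input schema, the sketch below models a schema as an ordered map from column name to a type label. Everything here is hypothetical: the map representation, the "vector" label, and the two validation checks are assumptions of this sketch rather than documented behavior of transformSchema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical schema derivation; a schema is modeled as column name -> type label. */
final class SchemaSketch {

    /** Returns a copy of the input schema with the output column appended. */
    static Map<String, String> transformSchema(Map<String, String> schema,
                                               String inputCol, String outputCol) {
        // Assumed check: the input column must exist and hold vectors.
        if (!"vector".equals(schema.get(inputCol))) {
            throw new IllegalArgumentException("Input column " + inputCol + " must be a vector column");
        }
        // Assumed check: the output column must not already exist.
        if (schema.containsKey(outputCol)) {
            throw new IllegalArgumentException("Output column " + outputCol + " already exists");
        }
        Map<String, String> out = new LinkedHashMap<>(schema); // preserves column order
        out.put(outputCol, "vector");
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("id", "long");
        schema.put("features", "vector");
        System.out.println(transformSchema(schema, "features", "scaledFeatures").keySet());
    }
}
```

Working on the schema alone lets a pipeline detect a missing or mistyped column before any data is processed.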

copy

public StandardScaler copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.

Specified by:
copy in interface Params
Specified by:
copy in class Estimator<StandardScalerModel>
Parameters:
extra - extra params to apply to the copy
Returns:
a copy of this instance with the same UID
See Also:
defaultCopy()

withMean

public BooleanParam withMean()
Whether to center the data with the mean before scaling. Centering builds a dense output, so it does not work on sparse input and raises an exception. Default: false

Returns:
the withMean param
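The reason centering produces a dense output can be seen in a small plain-Java sketch: subtracting a non-zero column mean turns every implicit zero of a sparse vector into a non-zero value. The map-based sparse vector and the mean values below are hypothetical and stand in for Spark's vector types.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustration only; the map-based sparse vector is a stand-in, not a Spark type. */
final class CenteringDensitySketch {

    /** Subtracts the column mean at every index, including indices the sparse vector does not store. */
    static double[] center(Map<Integer, Double> sparse, int size, double[] mean) {
        double[] dense = new double[size];
        for (int j = 0; j < size; j++) {
            dense[j] = sparse.getOrDefault(j, 0.0) - mean[j];
        }
        return dense;
    }

    /** Counts entries that are not exactly zero. */
    static int countNonZeros(double[] values) {
        int count = 0;
        for (double v : values) {
            if (v != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<Integer, Double> sparse = new TreeMap<>();
        sparse.put(1, 4.0);                     // one stored entry out of four
        double[] mean = {0.5, 1.0, 0.25, 2.0};  // hypothetical column means
        double[] centered = center(sparse, 4, mean);
        System.out.println(sparse.size() + " stored entry before, "
                + countNonZeros(centered) + " non-zero entries after centering");
    }
}
```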

withStd

public BooleanParam withStd()
Whether to scale the data to unit standard deviation. Default: true

Returns:
the withStd param
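In contrast to centering, scaling alone keeps zeros at zero, so a sparse vector can stay sparse. The plain-Java sketch below divides only the stored entries by a per-column standard deviation. The map-based sparse vector, the standard deviation values, and the handling of zero-variance columns are assumptions of this sketch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustration only; the map-based sparse vector is a stand-in, not a Spark type. */
final class ScaleOnlySketch {

    /** Divides each stored entry by its column standard deviation; absent entries stay zero. */
    static Map<Integer, Double> scaleStored(Map<Integer, Double> sparse, double[] std) {
        Map<Integer, Double> out = new TreeMap<>();
        for (Map.Entry<Integer, Double> e : sparse.entrySet()) {
            double s = std[e.getKey()];
            // Assumed convention: a zero-variance column maps to 0.0.
            out.put(e.getKey(), s == 0.0 ? 0.0 : e.getValue() / s);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, Double> sparse = new TreeMap<>();
        sparse.put(1, 4.0);
        sparse.put(3, 9.0);
        double[] std = {1.0, 2.0, 1.0, 3.0}; // hypothetical column standard deviations
        System.out.println(scaleStored(sparse, std));
    }
}
```

The number of stored entries is unchanged by this operation, which is why the dense-output caveat is attached to withMean and not to withStd.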