org.apache.spark.ml.feature
Class Binarizer

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Transformer
          extended by org.apache.spark.ml.feature.Binarizer
All Implemented Interfaces:
java.io.Serializable, Logging, Params

public final class Binarizer
extends Transformer

:: Experimental :: Binarizes a column of continuous features against a given threshold.

See Also:
Serialized Form

Constructor Summary
Binarizer()
           
Binarizer(String uid)
           
 
Method Summary
 Binarizer copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
 double getThreshold()
           
 Binarizer setInputCol(String value)
           
 Binarizer setOutputCol(String value)
           
 Binarizer setThreshold(double value)
           
 DoubleParam threshold()
          Param for threshold used to binarize continuous features.
 DataFrame transform(DataFrame dataset)
          Transforms the input dataset.
 StructType transformSchema(StructType schema)
          :: DeveloperApi :: Derives the output schema from the input schema.
 String uid()
           
 
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

Binarizer

public Binarizer(String uid)

Binarizer

public Binarizer()

Method Detail

uid

public String uid()

threshold

public DoubleParam threshold()
Param for the threshold used to binarize continuous features. Features greater than the threshold are binarized to 1.0. Features equal to or less than the threshold are binarized to 0.0.

Returns:
(undocumented)
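
The threshold rule described above can be illustrated with a minimal plain-Java sketch. It does not use Spark and is not Binarizer's actual implementation. The class BinarizeSketch and its methods binarize and binarizeAll are hypothetical names introduced only for this illustration. The one thing taken from the description is the comparison: strictly greater than the threshold maps to 1.0, anything else maps to 0.0.

```java
import java.util.Arrays;

// Hypothetical illustration of the threshold rule; not part of the Spark API.
class BinarizeSketch {

    // Returns 1.0 if value is strictly greater than threshold, otherwise 0.0.
    static double binarize(double value, double threshold) {
        return value > threshold ? 1.0 : 0.0;
    }

    // Applies the same rule to every element, returning a new array.
    static double[] binarizeAll(double[] values, double threshold) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = binarize(values[i], threshold);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] features = {0.1, 0.5, 0.8};
        // A value equal to the threshold (0.5) falls on the 0.0 side.
        System.out.println(Arrays.toString(binarizeAll(features, 0.5)));
        // prints [0.0, 0.0, 1.0]
    }
}
```

Note the boundary case: a feature exactly equal to the threshold is binarized to 0.0, because the comparison is strict.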

getThreshold

public double getThreshold()

setThreshold

public Binarizer setThreshold(double value)

setInputCol

public Binarizer setInputCol(String value)

setOutputCol

public Binarizer setOutputCol(String value)

transform

public DataFrame transform(DataFrame dataset)
Description copied from class: Transformer
Transforms the input dataset.

Specified by:
transform in class Transformer
Parameters:
dataset - (undocumented)
Returns:
(undocumented)

transformSchema

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
:: DeveloperApi ::

Derives the output schema from the input schema.

Specified by:
transformSchema in class PipelineStage
Parameters:
schema - (undocumented)
Returns:
(undocumented)

copy

public Binarizer copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.

Specified by:
copy in interface Params
Specified by:
copy in class Transformer
Parameters:
extra - (undocumented)
Returns:
(undocumented)
See Also:
defaultCopy()