org.apache.spark.ml
Class Transformer

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Transformer
All Implemented Interfaces:
java.io.Serializable, Logging, Params
Direct Known Subclasses:
Binarizer, HashingTF, Model, OneHotEncoder, UnaryTransformer, VectorAssembler

public abstract class Transformer
extends PipelineStage

:: DeveloperApi :: Abstract class for transformers that transform one dataset into another.

See Also:
Serialized Form

Constructor Summary
Transformer()
           
 
Method Summary
abstract  Transformer copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
abstract  DataFrame transform(DataFrame dataset)
          Transforms the input dataset.
 DataFrame transform(DataFrame dataset, ParamMap paramMap)
          Transforms the dataset with the provided parameter map as additional parameters.
 DataFrame transform(DataFrame dataset, ParamPair<?> firstParamPair, ParamPair<?>... otherParamPairs)
          Transforms the dataset with optional parameters.
 DataFrame transform(DataFrame dataset, ParamPair<?> firstParamPair, scala.collection.Seq<ParamPair<?>> otherParamPairs)
          Transforms the dataset with optional parameters.
 
Methods inherited from class org.apache.spark.ml.PipelineStage
transformSchema
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

Transformer

public Transformer()

Method Detail

transform

public DataFrame transform(DataFrame dataset,
                           ParamPair<?> firstParamPair,
                           ParamPair<?>... otherParamPairs)
Transforms the dataset with optional parameters.

Parameters:
dataset - input dataset
firstParamPair - the first param pair; overrides the embedded params
otherParamPairs - other param pairs; these override the embedded params
Returns:
transformed dataset

transform

public DataFrame transform(DataFrame dataset,
                           ParamPair<?> firstParamPair,
                           scala.collection.Seq<ParamPair<?>> otherParamPairs)
Transforms the dataset with optional parameters.

Parameters:
dataset - input dataset
firstParamPair - the first param pair; overrides the embedded params
otherParamPairs - other param pairs; these override the embedded params
Returns:
transformed dataset

transform

public DataFrame transform(DataFrame dataset,
                           ParamMap paramMap)
Transforms the dataset with the provided parameter map as additional parameters.

Parameters:
dataset - input dataset
paramMap - additional parameters; these override the embedded params
Returns:
transformed dataset
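
The override behavior described for the transform overloads above can be illustrated without Spark. The following is a minimal sketch in plain Java, not Spark's implementation: ParamOverrideSketch, effectiveParams, and the string keys are hypothetical names, and ordinary maps stand in for ParamMap and ParamPair. It assumes only what the descriptions state, namely that values supplied at call time take precedence over the params embedded in the transformer. It further assumes that the embedded params themselves are left unchanged by the call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: plain maps play the role of ParamMap.
class ParamOverrideSketch {

    // Merge call-time params over embedded ones; call-time values win on conflict.
    static Map<String, Object> effectiveParams(Map<String, Object> embedded,
                                               Map<String, Object> extra) {
        // Copy first so the embedded params are never mutated.
        Map<String, Object> merged = new LinkedHashMap<>(embedded);
        merged.putAll(extra);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> embedded = new LinkedHashMap<>();
        embedded.put("threshold", 0.5);
        embedded.put("inputCol", "feature");

        Map<String, Object> extra = new LinkedHashMap<>();
        extra.put("threshold", 0.8);

        Map<String, Object> merged = effectiveParams(embedded, extra);
        System.out.println(merged.get("threshold"));   // call-time value
        System.out.println(merged.get("inputCol"));    // embedded value is kept
        System.out.println(embedded.get("threshold")); // embedded map is unchanged
    }
}
```

In this model a param that appears only in the embedded map passes through untouched, while a param that appears in both takes the call-time value.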

transform

public abstract DataFrame transform(DataFrame dataset)
Transforms the input dataset.

Parameters:
dataset - input dataset
Returns:
transformed dataset

copy

public abstract Transformer copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and declare their own type as the return type.

Specified by:
copy in interface Params
Specified by:
copy in class PipelineStage
Parameters:
extra - extra params to apply to the copy
Returns:
a copy of this instance with the same UID
See Also:
defaultCopy()
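
The copy contract above can be sketched in plain Java. This is a hypothetical model, not Spark code: ToyStage, ToyBinarizer, the uid field, and the string-keyed map are illustrative stand-ins for a Params implementation and ParamMap. It assumes only what the description states: the copy keeps the UID of the original, carries the extra params, and is declared with the subclass's own type as the return type (a covariant return).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base type standing in for an abstract stage with a copy contract.
abstract class ToyStage {
    final String uid;
    final Map<String, Object> params;

    ToyStage(String uid, Map<String, Object> params) {
        this.uid = uid;
        this.params = new HashMap<>(params); // defensive copy of the given params
    }

    abstract ToyStage copy(Map<String, Object> extra);
}

// A concrete subclass narrows the return type of copy to itself.
class ToyBinarizer extends ToyStage {
    ToyBinarizer(String uid, Map<String, Object> params) {
        super(uid, params);
    }

    @Override
    ToyBinarizer copy(Map<String, Object> extra) {
        Map<String, Object> merged = new HashMap<>(params);
        merged.putAll(extra);                 // extra params override existing ones
        return new ToyBinarizer(uid, merged); // same UID, new instance
    }
}
```

Because copy is declared to return ToyBinarizer, a caller holding a ToyBinarizer can use the result without a cast, and the original instance keeps its own params.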