org.apache.spark.ml.feature
Class IDF

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Estimator<IDFModel>
          extended by org.apache.spark.ml.feature.IDF
All Implemented Interfaces:
java.io.Serializable, Logging, Params

public final class IDF
extends Estimator<IDFModel>

:: Experimental :: Computes the Inverse Document Frequency (IDF) of each term, given a collection of documents.
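
For orientation, the weighting can be written out as a formula. This page does not state it; the smoothed form below is the one documented for the underlying org.apache.spark.mllib.feature.IDF, and it is assumed here that this class delegates to it. The symbols are illustrative: m is the total number of documents and d(t) is the number of documents that contain term t.

```latex
\mathrm{idf}(t) = \log\frac{m + 1}{d(t) + 1}
```

The added 1 in the numerator and denominator keeps the weight finite for a term that appears in no document, and gives a weight of 0 to a term that appears in every document.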

See Also:
Serialized Form

Constructor Summary
IDF()
           
IDF(String uid)
           
 
Method Summary
 IDF copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
 IDFModel fit(DataFrame dataset)
          Fits a model to the input data.
 int getMinDocFreq()
           
 IntParam minDocFreq()
          The minimum number of documents in which a term should appear.
 IDF setInputCol(String value)
           
 IDF setMinDocFreq(int value)
           
 IDF setOutputCol(String value)
           
 StructType transformSchema(StructType schema)
          :: DeveloperApi :: Derives the output schema from the input schema.
 String uid()
           
 StructType validateAndTransformSchema(StructType schema)
          Validate and transform the input schema.
 
Methods inherited from class org.apache.spark.ml.Estimator
fit, fit, fit, fit
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

IDF

public IDF(String uid)

IDF

public IDF()

Method Detail

uid

public String uid()

setInputCol

public IDF setInputCol(String value)

setOutputCol

public IDF setOutputCol(String value)

setMinDocFreq

public IDF setMinDocFreq(int value)

fit

public IDFModel fit(DataFrame dataset)
Description copied from class: Estimator
Fits a model to the input data.

Specified by:
fit in class Estimator<IDFModel>
Parameters:
dataset - the input DataFrame on which to fit the model
Returns:
the fitted IDFModel

transformSchema

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
:: DeveloperApi ::

Derives the output schema from the input schema.

Specified by:
transformSchema in class PipelineStage
Parameters:
schema - (undocumented)
Returns:
(undocumented)

copy

public IDF copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.

Specified by:
copy in interface Params
Specified by:
copy in class Estimator<IDFModel>
Parameters:
extra - (undocumented)
Returns:
(undocumented)
See Also:
defaultCopy()

minDocFreq

public IntParam minDocFreq()
The minimum number of documents in which a term should appear.

Returns:
(undocumented)
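
The effect of this threshold can be illustrated without Spark. The following is a minimal plain-Java sketch, not the library's implementation: the class IdfSketch and its idf method are hypothetical names introduced for illustration. It assumes the behavior documented for org.apache.spark.mllib.feature.IDF, where a term that appears in fewer than minDocFreq documents receives an IDF of 0 and every other term receives log((m + 1) / (d(t) + 1)).

```java
/** Illustrative only: a hypothetical helper, not part of the Spark API. */
public final class IdfSketch {
    private IdfSketch() {}

    /**
     * Computes smoothed IDF weights from per-term document frequencies.
     *
     * @param docFreq    docFreq[t] is the number of documents containing term t
     * @param numDocs    total number of documents (m)
     * @param minDocFreq terms appearing in fewer documents than this get weight 0
     */
    static double[] idf(long[] docFreq, long numDocs, int minDocFreq) {
        double[] weights = new double[docFreq.length]; // initialized to 0.0
        for (int t = 0; t < docFreq.length; t++) {
            if (docFreq[t] >= minDocFreq) {
                weights[t] = Math.log((numDocs + 1.0) / (docFreq[t] + 1.0));
            }
            // Otherwise the weight stays 0.0, so the term's TF-IDF is also 0.
        }
        return weights;
    }

    public static void main(String[] args) {
        long[] docFreq = {4, 2, 1, 0}; // four terms in a four-document corpus
        double[] weights = idf(docFreq, 4, 2);
        for (int t = 0; t < weights.length; t++) {
            System.out.println("term " + t + " -> " + weights[t]);
        }
    }
}
```

With minDocFreq set to 2 in this sketch, the terms seen in one or zero documents are filtered to 0, and the term present in all four documents also scores 0 because log(5/5) is 0.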

getMinDocFreq

public int getMinDocFreq()

validateAndTransformSchema

public StructType validateAndTransformSchema(StructType schema)
Validate and transform the input schema.

Parameters:
schema - (undocumented)
Returns:
(undocumented)