org.apache.spark.mllib.feature
Class IDFModel

Object
  extended by org.apache.spark.mllib.feature.IDFModel
All Implemented Interfaces:
java.io.Serializable

public class IDFModel
extends Object
implements scala.Serializable

:: Experimental :: Represents an inverse document frequency (IDF) model that transforms term frequency (TF) vectors into TF-IDF vectors.

See Also:
Serialized Form

Method Summary
 Vector idf()
          Returns the inverse document frequency (IDF) vector of this model.
 JavaRDD<Vector> transform(JavaRDD<Vector> dataset)
          Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).
 RDD<Vector> transform(RDD<Vector> dataset)
          Transforms term frequency (TF) vectors to TF-IDF vectors.
 Vector transform(Vector v)
          Transforms a term frequency (TF) vector to a TF-IDF vector.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

idf

public Vector idf()
Returns the inverse document frequency (IDF) vector of this model.

transform

public RDD<Vector> transform(RDD<Vector> dataset)
Transforms term frequency (TF) vectors to TF-IDF vectors.

If minDocFreq was set when the IDF was computed, terms that occur in fewer than minDocFreq documents have an entry of 0 in the resulting TF-IDF vectors.

Parameters:
dataset - an RDD of term frequency vectors
Returns:
an RDD of TF-IDF vectors
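
The effect of minDocFreq described above can be illustrated with a plain-Java sketch that does not use Spark. It assumes the smoothed formula idf = log((m + 1) / (df + 1)) given in the MLlib IDF documentation (not stated on this page), where m is the number of documents and df is a term's document frequency. The class MinDocFreqSketch, its method names, and the use of double[] arrays in place of Vector and RDD are all hypothetical stand-ins.

```java
// Illustrative sketch only: arrays stand in for MLlib Vector and RDD (an assumption).
final class MinDocFreqSketch {

    /**
     * Computes one IDF weight per term. Terms whose document frequency is
     * below minDocFreq keep the default weight 0.0.
     * Assumed formula: log((numDocs + 1) / (docFreq + 1)).
     */
    static double[] idfFromDocFreq(long[] docFreq, long numDocs, int minDocFreq) {
        double[] idf = new double[docFreq.length];
        for (int j = 0; j < docFreq.length; j++) {
            if (docFreq[j] >= minDocFreq) {
                idf[j] = Math.log((numDocs + 1.0) / (docFreq[j] + 1.0));
            }
        }
        return idf;
    }

    /** Scales every TF row element-wise by the IDF weights. */
    static double[][] transform(double[] idf, double[][] dataset) {
        double[][] out = new double[dataset.length][];
        for (int r = 0; r < dataset.length; r++) {
            double[] tf = dataset[r];
            double[] row = new double[tf.length];
            for (int j = 0; j < tf.length; j++) {
                row[j] = tf[j] * idf[j];
            }
            out[r] = row;
        }
        return out;
    }

    public static void main(String[] args) {
        // Term 0 occurs in 1 of 4 documents, so minDocFreq = 2 filters it out.
        double[] idf = idfFromDocFreq(new long[] {1, 2, 4}, 4, 2);
        double[][] tfidf = transform(idf, new double[][] {{2.0, 1.0, 0.0}, {1.0, 0.0, 3.0}});
        for (double[] row : tfidf) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

In this sketch the first column of every output row is 0 regardless of the term's frequency, because the filtered term's IDF weight is 0.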

transform

public Vector transform(Vector v)
Transforms a term frequency (TF) vector to a TF-IDF vector.

Parameters:
v - a term frequency vector
Returns:
a TF-IDF vector
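
As a rough illustration of the single-vector transform, the following plain-Java sketch scales each term frequency by the IDF weight at the same index. It is not the MLlib implementation: the class TfIdfSketch is hypothetical, and dense double[] arrays are assumed in place of Vector.

```java
// Illustrative sketch only: dense double[] arrays stand in for MLlib Vector (an assumption).
final class TfIdfSketch {

    /** Multiplies each term frequency by the IDF weight at the same index. */
    static double[] transform(double[] idf, double[] tf) {
        if (idf.length != tf.length) {
            throw new IllegalArgumentException("idf and tf must have the same length");
        }
        double[] out = new double[tf.length];
        for (int i = 0; i < tf.length; i++) {
            out[i] = tf[i] * idf[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] idf = {0.0, 0.5, 2.0};
        double[] tf = {3.0, 2.0, 1.0};
        // prints [0.0, 1.0, 2.0]
        System.out.println(java.util.Arrays.toString(transform(idf, tf)));
    }
}
```

A term with IDF weight 0 (the first entry here) contributes 0 to the result no matter how often it occurs in the document.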

transform

public JavaRDD<Vector> transform(JavaRDD<Vector> dataset)
Transforms term frequency (TF) vectors to TF-IDF vectors (Java version).

Parameters:
dataset - a JavaRDD of term frequency vectors
Returns:
a JavaRDD of TF-IDF vectors