org.apache.spark.mllib.feature
Class Word2VecModel

Object
  extended by org.apache.spark.mllib.feature.Word2VecModel
All Implemented Interfaces:
java.io.Serializable, Saveable

public class Word2VecModel
extends Object
implements scala.Serializable, Saveable

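:: Experimental :: Word2Vec model. Maps each word in the vocabulary to its vector representation and supports synonym lookup by cosine similarity.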

See Also:
Serialized Form

Method Summary
 scala.Tuple2<String,Object>[] findSynonyms(String word, int num)
          Find synonyms of a word.
 scala.Tuple2<String,Object>[] findSynonyms(Vector vector, int num)
          Find synonyms of the vector representation of a word.
 scala.collection.immutable.Map<String,float[]> getVectors()
          Returns a map of words to their vector representations.
static Word2VecModel load(SparkContext sc, String path)
          Load a model from the given path.
 void save(SparkContext sc, String path)
          Save this model to the given path.
 Vector transform(String word)
          Transforms a word to its vector representation.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

load

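public static Word2VecModel load(SparkContext sc,
                                 String path)
Load a model from the given path.

Parameters:
sc - Spark context used to load model data.
path - Path specifying the directory from which to load the model, as previously written by save.
Returns:
the loaded Word2VecModel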

save

public void save(SparkContext sc,
                 String path)
Description copied from interface: Saveable
Save this model to the given path.

This saves:
 - human-readable (JSON) model metadata to path/metadata/
 - Parquet formatted data to path/data/

The model may be loaded using Loader.load; for this class, that is the static load(SparkContext, String) method.

Specified by:
save in interface Saveable
Parameters:
sc - Spark context used to save model data.
path - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.
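Because save throws an exception when the target directory already exists, a caller may want to check the path before saving. The following is a minimal sketch, assuming the model is written to a local filesystem path (not HDFS or another remote store). The class SavePathCheck and the helper isFreeToSave are hypothetical names introduced for illustration and are not part of Spark; the Spark call itself appears only in a comment.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SavePathCheck {

    // Hypothetical helper: true when nothing exists at the given local path,
    // so a subsequent save to that path would not hit an existing directory.
    static boolean isFreeToSave(String path) {
        return !Files.exists(Paths.get(path));
    }

    public static void main(String[] args) throws IOException {
        // A directory that already exists (stands in for an earlier save).
        Path existing = Files.createTempDirectory("w2v-model");
        // A path below it that has not been created yet.
        Path fresh = existing.resolve("run-1");

        System.out.println(isFreeToSave(existing.toString())); // prints false
        System.out.println(isFreeToSave(fresh.toString()));    // prints true

        // With a real model and SparkContext (assumed to exist elsewhere):
        // if (isFreeToSave(fresh.toString())) { model.save(sc, fresh.toString()); }

        Files.delete(existing); // clean up the temporary directory
    }
}
```

The check and the save are not atomic, so another process could still create the directory in between; the sketch only avoids the common case of re-running a job against an old output path.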

transform

public Vector transform(String word)
Transforms a word to its vector representation.

Parameters:
word - the word to transform
Returns:
vector representation of word

findSynonyms

public scala.Tuple2<String,Object>[] findSynonyms(String word,
                                                  int num)
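Find the words most similar to a given word, ranked by cosine similarity.

Parameters:
word - the query word
num - number of synonyms to find
Returns:
array of (word, cosineSimilarity) pairs; the similarity is a Scala Double, which the Java signature shows as Object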

findSynonyms

public scala.Tuple2<String,Object>[] findSynonyms(Vector vector,
                                                  int num)
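Find the words whose vectors are most similar to a given vector, ranked by cosine similarity.

Parameters:
vector - vector representation of a word
num - number of synonyms to find
Returns:
array of (word, cosineSimilarity) pairs; the similarity is a Scala Double, which the Java signature shows as Object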
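To make the (word, cosineSimilarity) result concrete, the following sketch ranks a small vocabulary by cosine similarity to a query vector and keeps the top num entries. It is an illustration in plain Java under stated assumptions, not Spark's implementation: the class CosineRanking, the methods cosine and topSimilar, and the toy vectors are all hypothetical, and a java.util.Map stands in for the word-to-vector map.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CosineRanking {

    // Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
    static double cosine(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Scores every word against the query and returns the num highest-scoring
    // (word, similarity) pairs in descending order of similarity.
    static List<Map.Entry<String, Double>> topSimilar(
            Map<String, float[]> vectors, float[] query, int num) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, float[]> e : vectors.entrySet()) {
            scored.add(new AbstractMap.SimpleEntry<String, Double>(
                    e.getKey(), cosine(e.getValue(), query)));
        }
        scored.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        return new ArrayList<>(scored.subList(0, Math.min(num, scored.size())));
    }

    public static void main(String[] args) {
        // Toy two-dimensional vectors, made up for this example.
        Map<String, float[]> vectors = new LinkedHashMap<>();
        vectors.put("apple", new float[] {0.0f, 1.0f});
        vectors.put("king", new float[] {1.0f, 0.0f});
        vectors.put("queen", new float[] {0.9f, 0.1f});

        float[] query = {1.0f, 0.0f};
        // Lists "king" first, then "queen"; "apple" is cut off by num = 2.
        for (Map.Entry<String, Double> e : topSimilar(vectors, query, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Note that a query vector taken from the vocabulary matches its own word with similarity 1.0, as "king" does here, so callers ranking by vector may want to request one extra result and drop the query word.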

getVectors

public scala.collection.immutable.Map<String,float[]> getVectors()
Returns a map of words to their vector representations.

Returns:
map from each word in the vocabulary to its vector, as a float array