org.apache.spark.ml.feature
Class Word2VecModel

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Transformer
          extended by org.apache.spark.ml.Model<Word2VecModel>
              extended by org.apache.spark.ml.feature.Word2VecModel
All Implemented Interfaces:
java.io.Serializable, Logging, Params

public class Word2VecModel
extends Model<Word2VecModel>

:: Experimental :: Model fitted by Word2Vec.

See Also:
Serialized Form

Method Summary
 Word2VecModel copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
 int getMinCount()
           
 int getNumPartitions()
           
 int getVectorSize()
           
 IntParam minCount()
          The minimum number of times a token must appear to be included in the word2vec model's vocabulary.
 IntParam numPartitions()
          Number of partitions for sentences of words.
 Word2VecModel setInputCol(String value)
           
 Word2VecModel setOutputCol(String value)
           
 DataFrame transform(DataFrame dataset)
          Transform a sentence column to a vector column to represent the whole sentence.
 StructType transformSchema(StructType schema)
          :: DeveloperApi :: Derives the output schema from the input schema.
 String uid()
           
 StructType validateAndTransformSchema(StructType schema)
          Validate and transform the input schema.
 IntParam vectorSize()
          The dimension of the vector that each word is transformed into.
 
Methods inherited from class org.apache.spark.ml.Model
hasParent, parent, setParent
 
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Method Detail

uid

public String uid()

setInputCol

public Word2VecModel setInputCol(String value)

setOutputCol

public Word2VecModel setOutputCol(String value)

transform

public DataFrame transform(DataFrame dataset)
Transform a sentence column to a vector column to represent the whole sentence. Each sentence vector is computed by averaging the vectors of all the words the sentence contains.

Specified by:
transform in class Transformer
Parameters:
dataset - the input DataFrame containing the sentence column
Returns:
the DataFrame with the sentence vector column
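
The averaging described for transform can be illustrated in plain Java, without Spark. This is a minimal sketch, not the library's implementation: the class SentenceAverager, the method average, and the Map-based word vector lookup are all hypothetical names introduced for illustration. The sketch also assumes that out-of-vocabulary words are skipped in the sum, that the divisor is the total number of tokens in the sentence, and that an empty sentence yields a zero vector.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Spark API.
final class SentenceAverager {

    // Averages the vectors of the words in a sentence.
    // Assumption: words missing from wordVectors add nothing to the sum,
    // and the divisor is the total token count of the sentence.
    static double[] average(List<String> sentence,
                            Map<String, double[]> wordVectors,
                            int vectorSize) {
        double[] sum = new double[vectorSize];
        if (sentence.isEmpty()) {
            return sum; // assumption: empty sentence maps to the zero vector
        }
        for (String word : sentence) {
            double[] vector = wordVectors.get(word);
            if (vector == null) {
                continue; // skip out-of-vocabulary words
            }
            for (int i = 0; i < vectorSize; i++) {
                sum[i] += vector[i];
            }
        }
        for (int i = 0; i < vectorSize; i++) {
            sum[i] /= sentence.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("spark", new double[] {1.0, 2.0});
        vectors.put("ml", new double[] {3.0, 4.0});
        double[] result = average(Arrays.asList("spark", "ml"), vectors, 2);
        System.out.println(Arrays.toString(result)); // prints [2.0, 3.0]
    }
}
```

Here vectorSize plays the role of the vectorSize param: every word vector, and therefore every sentence vector, has that many components.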

transformSchema

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
:: DeveloperApi ::

Derives the output schema from the input schema.

Specified by:
transformSchema in class PipelineStage
Parameters:
schema - the input schema
Returns:
the output schema

copy

public Word2VecModel copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.

Specified by:
copy in interface Params
Specified by:
copy in class Model<Word2VecModel>
Parameters:
extra - (undocumented)
Returns:
(undocumented)
See Also:
defaultCopy()

vectorSize

public IntParam vectorSize()
The dimension of the vector that each word is transformed into.

Returns:
(undocumented)

getVectorSize

public int getVectorSize()

numPartitions

public IntParam numPartitions()
Number of partitions for sentences of words.

Returns:
(undocumented)

getNumPartitions

public int getNumPartitions()

minCount

public IntParam minCount()
The minimum number of times a token must appear to be included in the word2vec model's vocabulary.

Returns:
(undocumented)
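
The effect of a minimum count threshold on a vocabulary can be sketched in plain Java. This is an illustration under stated assumptions, not Spark's code: the class VocabularyFilter and the method buildVocabulary are hypothetical, and the sketch assumes that a token is kept when its total count across all sentences is greater than or equal to minCount.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the Spark API.
final class VocabularyFilter {

    // Counts every token across all sentences and keeps those that
    // appear at least minCount times (assumed inclusive threshold).
    static Set<String> buildVocabulary(List<List<String>> sentences, int minCount) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Set<String> vocabulary = new TreeSet<>(); // sorted for stable output
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minCount) {
                vocabulary.add(entry.getKey());
            }
        }
        return vocabulary;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c"));
        // "a" and "b" each appear twice, "c" appears once.
        System.out.println(buildVocabulary(sentences, 2)); // prints [a, b]
    }
}
```

A larger threshold shrinks the vocabulary, so rare tokens receive no word vector at all.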

getMinCount

public int getMinCount()

validateAndTransformSchema

public StructType validateAndTransformSchema(StructType schema)
Validate and transform the input schema.

Parameters:
schema - (undocumented)
Returns:
(undocumented)