Package org.apache.spark.mllib.feature
Class HashingTF
Object
org.apache.spark.mllib.feature.HashingTF
- All Implemented Interfaces:
- Serializable
Maps a sequence of terms to their term frequencies using the hashing trick.
 
param: numFeatures - number of features (default: 2^20)
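The hashing trick described above can be illustrated with a minimal plain-Java sketch. It does not call Spark. The class `HashingTrickSketch` and its methods `bucket` and `termFrequencies` are hypothetical names introduced for illustration, and `Object.hashCode()` stands in for the real hash function, so the indices will generally differ from those Spark produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: not Spark's implementation.
final class HashingTrickSketch {
    // Map a term to a bucket in [0, numFeatures) using a non-negative modulo.
    // Assumption: Object.hashCode() is used here as a stand-in hash function.
    static int bucket(Object term, int numFeatures) {
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    // Count how many terms of the document fall into each bucket.
    // Only non-zero buckets are stored, which mirrors a sparse vector.
    static Map<Integer, Double> termFrequencies(List<?> document, int numFeatures) {
        Map<Integer, Double> tf = new TreeMap<>();
        for (Object term : document) {
            tf.merge(bucket(term, numFeatures), 1.0, Double::sum);
        }
        return tf;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("a", "b", "a", "c");
        // "a", "b", "c" have hash codes 97, 98, 99, all below 2^20.
        System.out.println(termFrequencies(doc, 1 << 20)); // prints {97=2.0, 98=1.0, 99=1.0}
    }
}
```

No vocabulary is built or stored in this scheme: the feature index is computed directly from the term, at the cost of possible collisions between distinct terms.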
Constructor Summary
- HashingTF()
- HashingTF(int numFeatures)
Method Summary
- int indexOf(term): Returns the index of the input term.
- int numFeatures()
- setBinary(boolean value): If true, the term frequency vector will be binary, so that non-zero term counts are set to 1 (default: false).
- setHashAlgorithm(String value): Sets the hash algorithm used when mapping a term to an integer.
- transform(document): Transforms the input document into a sparse term frequency vector (Java version).
- transform(dataset): Transforms the input documents to term frequency vectors (Java version).
- transform(dataset): Transforms the input documents to term frequency vectors.
- transform(document): Transforms the input document into a sparse term frequency vector.

Constructor Details
- HashingTF
public HashingTF(int numFeatures)
- HashingTF
public HashingTF()

Method Details
- numFeatures
public int numFeatures()
- setBinary
If true, the term frequency vector will be binary, so that non-zero term counts are set to 1 (default: false).
- Parameters:
- value - (undocumented)
- Returns:
- (undocumented)
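The difference between count mode and binary mode can be sketched in plain Java as follows. This is an illustration under assumptions, not Spark's code: `BinaryTfSketch` and its `binary` flag are hypothetical names, and `Object.hashCode()` stands in for the configured hash algorithm.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: not Spark's implementation.
final class BinaryTfSketch {
    // When binary is true, every bucket that receives at least one term is set to 1.0;
    // otherwise each occurrence of a term adds 1.0 to its bucket.
    static Map<Integer, Double> termFrequencies(List<?> document, int numFeatures, boolean binary) {
        Map<Integer, Double> tf = new TreeMap<>();
        for (Object term : document) {
            // Assumption: Object.hashCode() as a stand-in hash function.
            int index = Math.floorMod(term.hashCode(), numFeatures);
            if (binary) {
                tf.put(index, 1.0);
            } else {
                tf.merge(index, 1.0, Double::sum);
            }
        }
        return tf;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("a", "a", "b");
        System.out.println(termFrequencies(doc, 1024, false)); // prints {97=2.0, 98=1.0}
        System.out.println(termFrequencies(doc, 1024, true));  // prints {97=1.0, 98=1.0}
    }
}
```

Binary mode records only whether a term is present, which can suit models that expect presence indicators rather than counts.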
 
- setHashAlgorithm
Sets the hash algorithm used when mapping a term to an integer (default: murmur3).
- Parameters:
- value - (undocumented)
- Returns:
- (undocumented)
 
- indexOf
Returns the index of the input term.
- Parameters:
- term - (undocumented)
- Returns:
- (undocumented)
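As a rough illustration of how a term can be mapped to an index, the plain-Java sketch below reduces a hash code to the range [0, numFeatures). It is an assumption-laden stand-in, not the library's method: `IndexOfSketch.indexOf` is a hypothetical helper that takes `numFeatures` as an explicit argument and uses `Object.hashCode()` instead of the configured hash algorithm.

```java
// Illustrative sketch only: not Spark's implementation.
final class IndexOfSketch {
    // Hypothetical helper: bucket index of a term.
    static int indexOf(Object term, int numFeatures) {
        // Math.floorMod keeps the result in [0, numFeatures) even for negative hash codes,
        // whereas the % operator can return a negative remainder.
        return Math.floorMod(term.hashCode(), numFeatures);
    }

    public static void main(String[] args) {
        System.out.println(indexOf("a", 16)); // hash code 97, 97 mod 16 = 1
        System.out.println(indexOf("q", 16)); // hash code 113, 113 mod 16 = 1 (collides with "a")
        System.out.println(indexOf(-7, 4));   // hash code -7, floorMod gives 1
    }
}
```

With a small number of features, distinct terms such as "a" and "q" share an index in this sketch; a larger feature count makes such collisions less likely.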
 
- transform
Transforms the input document into a sparse term frequency vector.
- Parameters:
- document - (undocumented)
- Returns:
- (undocumented)
 
- transform
Transforms the input document into a sparse term frequency vector (Java version).
- Parameters:
- document - (undocumented)
- Returns:
- (undocumented)
 
- transform
Transforms the input documents to term frequency vectors.
- Parameters:
- dataset - (undocumented)
- Returns:
- (undocumented)
 
- transform
Transforms the input documents to term frequency vectors (Java version).
- Parameters:
- dataset - (undocumented)
- Returns:
- (undocumented)