Package org.apache.spark.mllib.feature
Class                            Description
ChiSqSelector                    Creates a ChiSquared feature selector.
ChiSqSelectorModel               Chi Squared selector model.
ElementwiseProduct               Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
HashingTF                        Maps a sequence of terms to their term frequencies using the hashing trick.
IDF                              Inverse document frequency (IDF).
IDF.DocumentFrequencyAggregator  Document frequency aggregator.
IDFModel                         Represents an IDF model that can transform term frequency vectors.
Normalizer                       Normalizes samples individually to unit L^p norm.
PCA                              A feature transformer that projects vectors to a low-dimensional space using PCA.
PCAModel                         Model fitted by PCA that can project vectors to a low-dimensional space using PCA.
StandardScaler                   Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.
StandardScalerModel              Represents a StandardScaler model that can transform vectors.
VectorTransformer                Trait for transformation of a vector.
VocabWord                        Entry in vocabulary.
Word2Vec                         Word2Vec creates vector representation of words in a text corpus.
Word2VecModel                    Word2Vec model. param: wordIndex maps each word to an index, which can retrieve the corresponding vector from wordVectors. param: wordVectors is an array of length numWords * vectorSize; the vector for the word mapped to index i can be retrieved by the slice (i * vectorSize, i * vectorSize + vectorSize).
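
The flat wordVectors layout described for the Word2Vec model can be illustrated with a minimal, Spark-free sketch in plain Java. It only demonstrates the slice arithmetic (i * vectorSize, i * vectorSize + vectorSize). The class name WordVectorSlice and the method vectorFor are hypothetical names invented for this illustration and are not part of the Spark API; the use of float[] for the flat array is likewise an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Spark: shows how a word's vector
// can be sliced out of a flat array of length numWords * vectorSize.
public class WordVectorSlice {

    // Returns a copy of the slice [i * vectorSize, i * vectorSize + vectorSize),
    // where i is the index that wordIndex assigns to the word.
    static float[] vectorFor(String word,
                             Map<String, Integer> wordIndex,
                             float[] wordVectors,
                             int vectorSize) {
        Integer i = wordIndex.get(word);
        if (i == null) {
            throw new IllegalArgumentException("word not in vocabulary: " + word);
        }
        int start = i * vectorSize;
        return Arrays.copyOfRange(wordVectors, start, start + vectorSize);
    }

    public static void main(String[] args) {
        // Two words, vectorSize = 3, so the flat array has 2 * 3 = 6 entries.
        Map<String, Integer> wordIndex = new HashMap<>();
        wordIndex.put("spark", 0);
        wordIndex.put("vector", 1);
        float[] wordVectors = {1f, 2f, 3f, 4f, 5f, 6f};

        // Index 1 selects entries 3..5 of the flat array.
        System.out.println(Arrays.toString(vectorFor("vector", wordIndex, wordVectors, 3)));
        // prints [4.0, 5.0, 6.0]
    }
}
```

Storing all vectors in one flat array keeps them contiguous in memory; the cost is that callers must compute the offset themselves, as vectorFor does here.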