Class | Description |
---|---|
Binarizer | :: Experimental :: Binarize a column of continuous features given a threshold. |
Bucketizer | :: Experimental :: Bucketizer maps a column of continuous features to a column of feature buckets. |
ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. |
CountVectorizer | :: Experimental :: Extracts a vocabulary from document collections and generates a CountVectorizerModel. |
CountVectorizerModel | :: Experimental :: Converts a text document to a sparse vector of token counts. |
DCT | :: Experimental :: A feature transformer that takes the 1D discrete cosine transform of a real vector. |
ElementwiseProduct | :: Experimental :: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector. |
HashingTF | :: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick. |
IDF | :: Experimental :: Compute the Inverse Document Frequency (IDF) given a collection of documents. |
IDFModel | |
IndexToString | :: Experimental :: A Transformer that maps a column of string indices back to a new column of corresponding string values, using the labels supplied by the user if provided, or otherwise the ML attributes of the input column. |
MinMaxScaler | :: Experimental :: Rescale each feature individually and linearly to a common range [min, max] using column summary statistics. This is also known as min-max normalization or rescaling. |
MinMaxScalerModel | |
NGram | :: Experimental :: A feature transformer that converts the input array of strings into an array of n-grams. |
Normalizer | :: Experimental :: Normalize a vector to have unit norm using the given p-norm. |
OneHotEncoder | :: Experimental :: A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. |
PCA | :: Experimental :: PCA trains a model to project vectors to a low-dimensional space using PCA. |
PCAModel | |
PolynomialExpansion | :: Experimental :: Perform feature expansion in a polynomial space. |
RegexTokenizer | :: Experimental :: A regex-based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or by repeatedly matching the regex (if gaps is false). |
RFormula | :: Experimental :: Implements the transforms required for fitting a dataset against an R model formula. |
RFormulaModel | :: Experimental :: A fitted RFormula. |
StandardScaler | :: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. |
StandardScalerModel | |
StopWords | Stop words list. |
StopWordsRemover | :: Experimental :: A feature transformer that filters out stop words from input. |
StringIndexer | :: Experimental :: A label indexer that maps a string column of labels to an ML column of label indices. |
StringIndexerModel | :: Experimental :: Model fitted by StringIndexer. |
Tokenizer | :: Experimental :: A tokenizer that converts the input string to lowercase and then splits it on whitespace. |
VectorAssembler | :: Experimental :: A feature transformer that merges multiple columns into a vector column. |
VectorIndexer | :: Experimental :: Class for indexing categorical feature columns in a dataset of Vector. |
VectorIndexer.CategoryStats | Helper class for tracking unique values for each feature. |
VectorIndexerModel | :: Experimental :: Transform categorical features to use 0-based indices instead of their original values. |
VectorSlicer | :: Experimental :: This class takes a feature vector and outputs a new feature vector with a subarray of the original features. |
Word2Vec | :: Experimental :: Word2Vec trains a model of Map(String, Vector). |
Word2VecModel | :: Experimental :: Model fitted by Word2Vec. |
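
To make a few of the one-line descriptions above concrete, the following is a minimal plain-Java sketch of the arithmetic behind Binarizer, MinMaxScaler, and ElementwiseProduct. It does not use the listed classes or any DataFrame API: the class `FeatureSketch` and its method names are hypothetical, it works on raw `double[]` arrays, and the edge-case choices (values equal to the threshold map to 0.0; a constant column maps to the midpoint of the target range) are assumptions of this sketch rather than statements taken from the table.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the API listed above.
public class FeatureSketch {

    // Binarizer-style thresholding: values strictly above the threshold
    // become 1.0, all others become 0.0 (assumed convention).
    public static double[] binarize(double[] column, double threshold) {
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = column[i] > threshold ? 1.0 : 0.0;
        }
        return out;
    }

    // Min-max normalization of one feature column into [min, max].
    // A constant column is mapped to the midpoint of the range (assumption).
    public static double[] minMaxScale(double[] column, double min, double max) {
        double lo = Arrays.stream(column).min().orElse(0.0);
        double hi = Arrays.stream(column).max().orElse(0.0);
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = (hi == lo)
                ? 0.5 * (min + max)
                : (column[i] - lo) / (hi - lo) * (max - min) + min;
        }
        return out;
    }

    // Hadamard (element-wise) product of an input vector and a weight vector.
    public static double[] hadamard(double[] vector, double[] weights) {
        if (vector.length != weights.length) {
            throw new IllegalArgumentException("vector and weights must have the same length");
        }
        double[] out = new double[vector.length];
        for (int i = 0; i < vector.length; i++) {
            out[i] = vector[i] * weights[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [0.0, 0.0, 1.0]
        System.out.println(Arrays.toString(binarize(new double[] {0.1, 0.5, 0.8}, 0.5)));
        // prints [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(minMaxScale(new double[] {1.0, 2.0, 3.0}, 0.0, 1.0)));
        // prints [0.0, 2.0, 6.0]
        System.out.println(Arrays.toString(hadamard(new double[] {1.0, 2.0, 3.0}, new double[] {0.0, 1.0, 2.0})));
    }
}
```

The real transformers operate on DataFrame columns and are configured through parameters; this sketch only shows the per-value computation that each description refers to.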