feature

Type Members

final class Binarizer extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

Binarize a column of continuous features given a threshold.

Annotations
@Since( "1.4.0" )
final class Bucketizer extends Model[Bucketizer] with HasInputCol with HasOutputCol with DefaultParamsWritable

Bucketizer maps a column of continuous features to a column of feature buckets.

Annotations
@Since( "1.4.0" )
final class ChiSqSelector extends Estimator[ChiSqSelectorModel] with ChiSqSelectorParams with DefaultParamsWritable

Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label.

Annotations
@Since( "1.6.0" )
final class ChiSqSelectorModel extends Model[ChiSqSelectorModel] with ChiSqSelectorParams with MLWritable

Model fitted by ChiSqSelector.

Annotations
@Since( "1.6.0" )
class CountVectorizer extends Estimator[CountVectorizerModel] with CountVectorizerParams with DefaultParamsWritable

Extracts a vocabulary from document collections and generates a CountVectorizerModel.

Annotations
@Since( "1.5.0" )
class CountVectorizerModel extends Model[CountVectorizerModel] with CountVectorizerParams with MLWritable

Converts a text document to a sparse vector of token counts.

Annotations
@Since( "1.5.0" )
class DCT extends UnaryTransformer[Vector, Vector, DCT] with DefaultParamsWritable

A feature transformer that takes the 1D discrete cosine transform of a real vector. No zero padding is performed on the input vector. It returns a real vector of the same length representing the DCT. The return vector is scaled such that the transform matrix is unitary (aka scaled DCT-II).
More information on Wikipedia.
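To make the scaling concrete, here is a minimal plain-Scala sketch of a DCT-II with the unitary scaling described above. It is an illustration only, not Spark's implementation; the names DctSketch and dct are hypothetical.

```scala
object DctSketch {
  /** Unitary (orthonormal) DCT-II of x; the result has the same length as x. */
  def dct(x: Array[Double]): Array[Double] = {
    val n = x.length
    Array.tabulate(n) { k =>
      // Scale factors that make the transform matrix unitary.
      val scale = if (k == 0) math.sqrt(1.0 / n) else math.sqrt(2.0 / n)
      val sum = x.indices.map(i => x(i) * math.cos(math.Pi / n * (i + 0.5) * k)).sum
      scale * sum
    }
  }
}
```

Because the transform is unitary, the sum of squares of the output equals the sum of squares of the input, up to floating-point error.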

Annotations
@Since( "1.5.0" )
class ElementwiseProduct extends UnaryTransformer[Vector, Vector, ElementwiseProduct] with DefaultParamsWritable

Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector. In other words, it scales each column of the dataset by a scalar multiplier.

Annotations
@Since( "1.4.0" )
class HashingTF extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

Maps a sequence of terms to their term frequencies using the hashing trick. Currently we use Austin Appleby's MurmurHash 3 algorithm (MurmurHash3_x86_32) to calculate the hash code value for the term object. Since a simple modulo is used to transform the hash value to a column index, it is advisable to use a power of two as the numFeatures parameter; otherwise the features will not be mapped evenly to the columns.
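The hashing trick can be sketched in plain Scala as follows. This is an illustration under assumptions: it uses scala.util.hashing.MurmurHash3.stringHash from the Scala standard library, which is not guaranteed to produce the same hash codes as Spark, and the names HashingTfSketch, nonNegativeMod, and termFrequencies are hypothetical.

```scala
import scala.util.hashing.MurmurHash3

object HashingTfSketch {
  /** Maps a possibly negative hash code to a column index in [0, numFeatures). */
  def nonNegativeMod(hash: Int, numFeatures: Int): Int = {
    val raw = hash % numFeatures
    if (raw < 0) raw + numFeatures else raw
  }

  /** Sparse term frequencies: column index -> number of terms hashed to it. */
  def termFrequencies(terms: Seq[String], numFeatures: Int): Map[Int, Double] =
    terms
      .map(t => nonNegativeMod(MurmurHash3.stringHash(t), numFeatures))
      .groupBy(identity)
      .map { case (index, hits) => index -> hits.size.toDouble }
}
```

Distinct terms may collide on the same column, in which case their counts are added together.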

Annotations
@Since( "1.2.0" )
final class IDF extends Estimator[IDFModel] with IDFBase with DefaultParamsWritable

Compute the Inverse Document Frequency (IDF) given a collection of documents.

Annotations
@Since( "1.4.0" )
class IDFModel extends Model[IDFModel] with IDFBase with MLWritable

Model fitted by IDF.

Annotations
@Since( "1.4.0" )
class IndexToString extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

A Transformer that maps a column of indices back to a new column of corresponding string values. The index-string mapping is either from the ML attributes of the input column, or from user-supplied labels (which take precedence over ML attributes).

Annotations
@Since( "1.5.0" )
See also
StringIndexer for converting strings into indices
class Interaction extends Transformer with HasInputCols with HasOutputCol with DefaultParamsWritable

Implements the feature interaction transform. This transformer takes in Double and Vector type columns and outputs a flattened vector of their feature interactions. To handle interaction, we first one-hot encode any nominal features. Then, a vector of the feature cross-products is produced.
For example, given the input feature values Double(2) and Vector(3, 4), the output would be Vector(6, 8) if all input features were numeric. If the first feature was instead nominal with four categories, the output would then be Vector(0, 0, 0, 0, 3, 4, 0, 0).
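The two cases in this example can be reproduced with a small plain-Scala sketch. It is an illustration only, not Spark's implementation; the names InteractionSketch, oneHot, and interact are hypothetical, and the nominal case assumes the first feature holds category index 2.

```scala
object InteractionSketch {
  /** One-hot encodes a category index over numCategories slots. */
  def oneHot(index: Int, numCategories: Int): Seq[Double] =
    Seq.tabulate(numCategories)(i => if (i == index) 1.0 else 0.0)

  /** Flattened cross-products of two encoded features; the left feature varies slowest. */
  def interact(left: Seq[Double], right: Seq[Double]): Seq[Double] =
    for (a <- left; b <- right) yield a * b
}
```

With a numeric first feature, interact(Seq(2.0), Seq(3.0, 4.0)) gives (6, 8). With a nominal first feature, interact(oneHot(2, 4), Seq(3.0, 4.0)) gives (0, 0, 0, 0, 3, 4, 0, 0).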

Annotations
@Since( "1.6.0" )
case class LabeledPoint(label: Double, features: Vector) extends Product with Serializable

:: Experimental ::
Class that represents the features and labels of a data point.
label
Label for this data point.
features
List of features for this data point.

Annotations
@Since( "2.0.0" ) @Experimental() @BeanInfo()
class MaxAbsScaler extends Estimator[MaxAbsScalerModel] with MaxAbsScalerParams with DefaultParamsWritable

:: Experimental :: Rescale each feature individually to range [-1, 1] by dividing through the largest maximum absolute value in each feature. It does not shift/center the data, and thus does not destroy any sparsity.

Annotations
@Experimental() @Since( "2.0.0" )
class MaxAbsScalerModel extends Model[MaxAbsScalerModel] with MaxAbsScalerParams with MLWritable

:: Experimental :: Model fitted by MaxAbsScaler.

Annotations
@Experimental() @Since( "2.0.0" )
class MinMaxScaler extends Estimator[MinMaxScalerModel] with MinMaxScalerParams with DefaultParamsWritable

Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling. The rescaled value for feature E is calculated as,
Rescaled(e_i) = \frac{e_i - E_{min}}{E_{max} - E_{min}} * (max - min) + min
For the case E_{max} == E_{min}, Rescaled(e_i) = 0.5 * (max + min). Note that since zero values will probably be transformed to non-zero values, output of the transformer will be DenseVector even for sparse input.
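The formula, including the E_{max} == E_{min} case, can be written as a short plain-Scala sketch over a single feature column. This is an illustration only; the names MinMaxSketch and rescale are hypothetical, and the column is assumed to be non-empty.

```scala
object MinMaxSketch {
  /** Rescales one feature column linearly to [min, max]. */
  def rescale(column: Seq[Double], min: Double = 0.0, max: Double = 1.0): Seq[Double] = {
    val eMin = column.min
    val eMax = column.max
    column.map { e =>
      if (eMax == eMin) 0.5 * (max + min) // constant column: use the midpoint of the range
      else (e - eMin) / (eMax - eMin) * (max - min) + min
    }
  }
}
```

For instance, rescale(Seq(1.0, 2.0, 3.0)) gives (0.0, 0.5, 1.0).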

Annotations
@Since( "1.5.0" )
class MinMaxScalerModel extends Model[MinMaxScalerModel] with MinMaxScalerParams with MLWritable

Model fitted by MinMaxScaler.

Annotations
@Since( "1.5.0" )
class NGram extends UnaryTransformer[Seq[String], Seq[String], NGram] with DefaultParamsWritable

A feature transformer that converts the input array of strings into an array of n-grams. Null values in the input array are ignored. It returns an array of n-grams where each n-gram is represented by a space-separated string of words.
When the input is empty, an empty array is returned. When the input array length is less than n (number of elements per n-gram), no n-grams are returned.
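The windowing behaviour can be sketched in plain Scala with a sliding iterator. This is an illustration only; the names NGramSketch and ngrams are hypothetical, n is assumed to be at least 1, and null handling is left out.

```scala
object NGramSketch {
  /** Space-separated n-grams over consecutive tokens. */
  def ngrams(tokens: Seq[String], n: Int): Seq[String] =
    tokens.iterator
      .sliding(n)
      .withPartial(false) // drop a window shorter than n, so short inputs yield nothing
      .map(_.mkString(" "))
      .toSeq
}
```

For instance, ngrams(Seq("a", "b", "c"), 2) gives ("a b", "b c"), while ngrams(Seq("a"), 2) is empty.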

Annotations
@Since( "1.5.0" )
class Normalizer extends UnaryTransformer[Vector, Vector, Normalizer] with DefaultParamsWritable

Normalize a vector to have unit norm using the given p-norm.
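A minimal plain-Scala sketch of p-norm normalization follows. It is an illustration only; the names NormalizerSketch, pNorm, and normalize are hypothetical, p is assumed to be finite and at least 1, and returning a zero vector unchanged is an assumption of this sketch.

```scala
object NormalizerSketch {
  /** The p-norm of v: (sum of |x|^p)^(1/p). */
  def pNorm(v: Seq[Double], p: Double): Double =
    math.pow(v.map(x => math.pow(math.abs(x), p)).sum, 1.0 / p)

  /** Divides every entry by the p-norm so the result has unit p-norm. */
  def normalize(v: Seq[Double], p: Double = 2.0): Seq[Double] = {
    val norm = pNorm(v, p)
    if (norm == 0.0) v else v.map(_ / norm) // assumption: leave the zero vector as-is
  }
}
```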

Annotations
@Since( "1.4.0" )
class OneHotEncoder extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. For example, with 5 categories, an input value of 2.0 would map to an output vector of [0.0, 0.0, 1.0, 0.0]. The last category is not included by default (configurable via dropLast), because including it would make the vector entries sum up to one, and hence linearly dependent. So an input value of 4.0 maps to [0.0, 0.0, 0.0, 0.0]. Note that this is different from scikit-learn's OneHotEncoder, which keeps all categories. The output vectors are sparse.
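The effect of dropping the last category can be sketched in plain Scala. This is an illustration only; the names OneHotSketch and encode are hypothetical, and the sketch returns a dense sequence for readability rather than a sparse vector.

```scala
object OneHotSketch {
  /** Encodes a category index; with dropLast the last category maps to all zeros. */
  def encode(index: Int, numCategories: Int, dropLast: Boolean = true): Seq[Double] = {
    val size = if (dropLast) numCategories - 1 else numCategories
    Seq.tabulate(size)(i => if (i == index) 1.0 else 0.0)
  }
}
```

With 5 categories, encode(2, 5) gives [0.0, 0.0, 1.0, 0.0] and encode(4, 5) gives [0.0, 0.0, 0.0, 0.0], matching the example above.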

Annotations
@Since( "1.4.0" )
See also
StringIndexer for converting categorical values into category indices
class PCA extends Estimator[PCAModel] with PCAParams with DefaultParamsWritable

PCA trains a model to project vectors to a lower dimensional space of the top k principal components.

Annotations
@Since( "1.5.0" )
class PCAModel extends Model[PCAModel] with PCAParams with MLWritable

Model fitted by PCA. Transforms vectors to a lower dimensional space.

Annotations
@Since( "1.5.0" )
class PolynomialExpansion extends UnaryTransformer[Vector, Vector, PolynomialExpansion] with DefaultParamsWritable

Perform feature expansion in a polynomial space. As the Wikipedia article on polynomial expansion (http://en.wikipedia.org/wiki/Polynomial_expansion) puts it, "In mathematics, an expansion of a product of sums expresses it as a sum of products by using the fact that multiplication distributes over addition". For example, expanding the 2-variable feature vector (x, y) with degree 2 gives (x, x * x, y, x * y, y * y).

Annotations
@Since( "1.4.0" )
final class QuantileDiscretizer extends Estimator[Bucketizer] with QuantileDiscretizerBase with DefaultParamsWritable

QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features. The number of bins can be set using the numBuckets parameter. The bin ranges are chosen using an approximate algorithm (see the documentation for approxQuantile for a detailed description). The precision of the approximation can be controlled with the relativeError parameter. The lower and upper bin bounds will be -Infinity and +Infinity, covering all real values.

Annotations
@Since( "1.6.0" )
class RFormula extends Estimator[RFormulaModel] with RFormulaBase with DefaultParamsWritable

:: Experimental :: Implements the transforms required for fitting a dataset against an R model formula. Currently we support a limited subset of the R operators, including '~', '.', ':', '+', and '-'. Also see the R formula docs here: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/formula.html
The basic operators are:
- ~ separate target and terms
- + concat terms, "+ 0" means removing intercept
- - remove a term, "- 1" means removing intercept
- : interaction (multiplication for numeric values, or binarized categorical values)
- . all columns except target
Suppose a and b are double columns, we use the following simple examples to illustrate the effect of RFormula:
- y ~ a + b means model y ~ w0 + w1 * a + w2 * b where w0 is the intercept and w1, w2 are coefficients.
- y ~ a + b + a:b - 1 means model y ~ w1 * a + w2 * b + w3 * a * b where w1, w2, w3 are coefficients.
RFormula produces a vector column of features and a double or string column of label. Like when formulas are used in R for linear regression, string input columns will be one-hot encoded, and numeric columns will be cast to doubles. If the label column is of type string, it will be first transformed to double with StringIndexer. If the label column does not exist in the DataFrame, the output label column will be created from the specified response variable in the formula.
Annotations
@Experimental() @Since( "1.5.0" )
class RFormulaModel extends Model[RFormulaModel] with RFormulaBase with MLWritable

:: Experimental :: Model fitted by RFormula. Fitting is required to determine the factor levels of formula terms.

Annotations
@Experimental() @Since( "1.5.0" )
class RegexTokenizer extends UnaryTransformer[String, Seq[String], RegexTokenizer] with DefaultParamsWritable

A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false). Optional parameters also allow filtering tokens using a minimal length. It returns an array of strings that can be empty.
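The two modes can be sketched in plain Scala with scala.util.matching.Regex. This is an illustration only, not Spark's implementation; the names RegexTokenizerSketch and tokenize, and the parameter name minTokenLength, are assumptions of this sketch.

```scala
object RegexTokenizerSketch {
  /** Splits on the pattern when gaps is true, otherwise collects the pattern's matches. */
  def tokenize(
      text: String,
      pattern: String,
      gaps: Boolean = true,
      minTokenLength: Int = 1): Seq[String] = {
    val regex = pattern.r
    val tokens = if (gaps) regex.split(text).toSeq else regex.findAllIn(text).toSeq
    // Drop tokens shorter than the minimal length; the result may be empty.
    tokens.filter(_.length >= minTokenLength)
  }
}
```

For instance, tokenize("a,bb,ccc", ",") splits on commas, while tokenize("a,bb,ccc", "\\w+", gaps = false) collects the word matches; both give ("a", "bb", "ccc").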

Annotations
@Since( "1.4.0" )
class SQLTransformer extends Transformer with DefaultParamsWritable

Implements the transformations which are defined by a SQL statement. Currently we only support SQL syntax like 'SELECT ... FROM __THIS__ ...' where '__THIS__' represents the underlying table of the input dataset. The select clause specifies the fields, constants, and expressions to display in the output; it can be any select clause that Spark SQL supports. Users can also use Spark SQL built-in functions and UDFs to operate on these selected columns. For example, SQLTransformer supports statements like:
- SELECT a, a + b AS a_b FROM __THIS__
- SELECT a, SQRT(b) AS b_sqrt FROM __THIS__ WHERE a > 5
- SELECT a, b, SUM(c) AS c_sum FROM __THIS__ GROUP BY a, b
Annotations
@Since( "1.6.0" )
class StandardScaler extends Estimator[StandardScalerModel] with StandardScalerParams with DefaultParamsWritable

Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
The "unit std" is computed using the corrected sample standard deviation, which is computed as the square root of the unbiased sample variance.
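The corrected sample standard deviation and the resulting standardization can be sketched in plain Scala for a single feature column. This is an illustration only; the names StandardScalerSketch, sampleStd, and standardize are hypothetical, and the column is assumed to hold at least two values that are not all equal.

```scala
object StandardScalerSketch {
  /** Square root of the unbiased sample variance (divides by n - 1). */
  def sampleStd(column: Seq[Double]): Double = {
    val n = column.size
    val mean = column.sum / n
    val unbiasedVariance = column.map(x => (x - mean) * (x - mean)).sum / (n - 1)
    math.sqrt(unbiasedVariance)
  }

  /** Removes the mean and scales to unit standard deviation. */
  def standardize(column: Seq[Double]): Seq[Double] = {
    val mean = column.sum / column.size
    val std = sampleStd(column)
    column.map(x => (x - mean) / std)
  }
}
```

For the column (1.0, 2.0, 3.0), the mean is 2.0 and the corrected standard deviation is 1.0, so the standardized column is (-1.0, 0.0, 1.0).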

Annotations
@Since( "1.2.0" )
class StandardScalerModel extends Model[StandardScalerModel] with StandardScalerParams with MLWritable

Model fitted by StandardScaler.

Annotations
@Since( "1.2.0" )
class StopWordsRemover extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

A feature transformer that filters out stop words from input. Note: null values in the input array are preserved unless null is explicitly added to stopWords.

Annotations
@Since( "1.5.0" )
See also
http://en.wikipedia.org/wiki/Stop_words
class StringIndexer extends Estimator[StringIndexerModel] with StringIndexerBase with DefaultParamsWritable

A label indexer that maps a string column of labels to an ML column of label indices. If the input column is numeric, we cast it to string and index the string values. The indices are in [0, numLabels), ordered by label frequencies. So the most frequent label gets index 0.
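The frequency ordering can be sketched in plain Scala. This is an illustration only; the names StringIndexerSketch and fit are hypothetical, and breaking ties alphabetically is an assumption of this sketch, since the description above does not say how ties are ordered.

```scala
object StringIndexerSketch {
  /** Assigns indices in [0, numLabels), most frequent label first. */
  def fit(labels: Seq[String]): Map[String, Int] =
    labels
      .groupBy(identity)
      .map { case (label, hits) => label -> hits.size }
      .toSeq
      .sortBy { case (label, count) => (-count, label) } // ties: alphabetical (assumption)
      .map(_._1)
      .zipWithIndex
      .toMap
}
```

For the labels (a, b, c, a, a, c), "a" occurs three times and gets index 0, "c" gets index 1, and "b" gets index 2.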

Annotations
@Since( "1.4.0" )
See also
IndexToString for the inverse transformation
class StringIndexerModel extends Model[StringIndexerModel] with StringIndexerBase with MLWritable

Model fitted by StringIndexer.
NOTE: During transformation, if the input column does not exist, StringIndexerModel.transform would return the input dataset unmodified. This is a temporary fix for the case when target labels do not exist during prediction.

Annotations
@Since( "1.4.0" )
class Tokenizer extends UnaryTransformer[String, Seq[String], Tokenizer] with DefaultParamsWritable

A tokenizer that converts the input string to lowercase and then splits it by white spaces.

Annotations
@Since( "1.2.0" )
See also
RegexTokenizer
class VectorAssembler extends Transformer with HasInputCols with HasOutputCol with DefaultParamsWritable

A feature transformer that merges multiple columns into a vector column.

Annotations
@Since( "1.4.0" )
class VectorIndexer extends Estimator[VectorIndexerModel] with VectorIndexerParams with DefaultParamsWritable

Class for indexing categorical feature columns in a dataset of Vector.
This has 2 usage modes:
- Automatically identify categorical features (default behavior)
  - This helps process a dataset of unknown vectors into a dataset with some continuous features and some categorical features. The choice between continuous and categorical is based upon a maxCategories parameter.
  - Set maxCategories to the maximum number of categories any categorical feature should have.
  - E.g.: Feature 0 has unique values {-1.0, 0.0}, and feature 1 values {1.0, 3.0, 5.0}. If maxCategories = 2, then feature 0 will be declared categorical and use indices {0, 1}, and feature 1 will be declared continuous.
- Index all features, if all features are categorical
  - If maxCategories is set to be very large, then this will build an index of unique values for all features.
  - Warning: This can cause problems if features are continuous since this will collect ALL unique values to the driver.
  - E.g.: Feature 0 has unique values {-1.0, 0.0}, and feature 1 values {1.0, 3.0, 5.0}. If maxCategories >= 3, then both features will be declared categorical.
This returns a model which can transform categorical features to use 0-based indices.
Index stability:
- This is not guaranteed to choose the same category index across multiple runs.
- If a categorical feature includes value 0, then this is guaranteed to map value 0 to index 0. This maintains vector sparsity.
- More stability may be added in the future.
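The categorical versus continuous decision and the zero-to-index-0 guarantee can be sketched in plain Scala. This is an illustration only, not Spark's implementation; the names VectorIndexerSketch and fit are hypothetical, rows are assumed to be non-empty and of equal length, and ordering the non-zero values ascending is an assumption of this sketch.

```scala
object VectorIndexerSketch {
  /** Per feature: Some(value -> category index) if categorical, None if continuous. */
  def fit(rows: Seq[Seq[Double]], maxCategories: Int): Seq[Option[Map[Double, Int]]] = {
    val numFeatures = rows.head.size
    (0 until numFeatures).map { j =>
      val distinctValues = rows.map(row => row(j)).distinct
      if (distinctValues.size > maxCategories) None
      else {
        // Put 0.0 first so it maps to index 0; the remaining order is an assumption.
        val (zeros, others) = distinctValues.partition(_ == 0.0)
        Some((zeros ++ others.sorted).zipWithIndex.toMap)
      }
    }
  }
}
```

With feature 0 taking values {-1.0, 0.0} and feature 1 taking values {1.0, 3.0, 5.0}, maxCategories = 2 marks feature 0 as categorical and feature 1 as continuous, while maxCategories = 3 marks both as categorical.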
Future extensions (TODO): the following functionality is planned:
- Preserve metadata in transform; if a feature's metadata is already present, do not recompute.
- Specify certain features to not index, either via a parameter or via existing metadata.
- Add warning if a categorical feature has only 1 category.
- Add option for allowing unknown categories.
Annotations
@Since( "1.4.0" )
class VectorIndexerModel extends Model[VectorIndexerModel] with VectorIndexerParams with MLWritable

Model fitted by VectorIndexer. Transform categorical features to use 0-based indices instead of their original values.
- Categorical features are mapped to indices.
- Continuous features (columns) are left unchanged.
This also appends metadata to the output column, marking features as Numeric (continuous), Nominal (categorical), or Binary (either continuous or categorical). Non-ML metadata is not carried over from the input to the output column.
This maintains vector sparsity.
Annotations
@Since( "1.4.0" )
final class VectorSlicer extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

This class takes a feature vector and outputs a new feature vector with a subarray of the original features.
The subset of features can be specified with either indices (setIndices()) or names (setNames()). At least one feature must be selected. Duplicate features are not allowed, so there can be no overlap between selected indices and names.
The output vector will order features with the selected indices first (in the order given), followed by the selected names (in the order given).
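The selection rules above can be sketched in plain Scala. This is an illustration only; the names VectorSlicerSketch and slice are hypothetical, and feature names are passed in as a plain sequence rather than read from column metadata.

```scala
object VectorSlicerSketch {
  /** Selects features by index first, then by name, each in the order given. */
  def slice(
      values: Seq[Double],
      featureNames: Seq[String],
      indices: Seq[Int],
      names: Seq[String]): Seq[Double] = {
    val nameIndices = names.map(name => featureNames.indexOf(name))
    val selected = indices ++ nameIndices
    require(selected.nonEmpty, "at least one feature must be selected")
    require(nameIndices.forall(_ >= 0), "unknown feature name")
    require(selected.distinct.size == selected.size, "duplicate features are not allowed")
    selected.map(i => values(i))
  }
}
```

For values (10.0, 20.0, 30.0) named (f1, f2, f3), selecting index 2 and name f1 gives (30.0, 10.0). Selecting index 0 together with name f1 fails, because both refer to the same feature.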

Annotations
@Since( "1.5.0" )
final class Word2Vec extends Estimator[Word2VecModel] with Word2VecBase with DefaultParamsWritable

Word2Vec trains a model of Map(String, Vector), i.e. it transforms a word into a code for further natural language processing or machine learning.

Annotations
@Since( "1.4.0" )
class Word2VecModel extends Model[Word2VecModel] with Word2VecBase with MLWritable

Model fitted by Word2Vec.

Annotations
@Since( "1.4.0" )

Value Members

object Binarizer extends DefaultParamsReadable[Binarizer] with Serializable

Annotations
@Since( "1.6.0" )
object Bucketizer extends DefaultParamsReadable[Bucketizer] with Serializable

Annotations
@Since( "1.6.0" )
object ChiSqSelector extends DefaultParamsReadable[ChiSqSelector] with Serializable

Annotations
@Since( "1.6.0" )
object ChiSqSelectorModel extends MLReadable[ChiSqSelectorModel] with Serializable

Annotations
@Since( "1.6.0" )
object CountVectorizer extends DefaultParamsReadable[CountVectorizer] with Serializable

Annotations
@Since( "1.6.0" )
object CountVectorizerModel extends MLReadable[CountVectorizerModel] with Serializable

Annotations
@Since( "1.6.0" )
object DCT extends DefaultParamsReadable[DCT] with Serializable

Annotations
@Since( "1.6.0" )
object ElementwiseProduct extends DefaultParamsReadable[ElementwiseProduct] with Serializable

Annotations
@Since( "2.0.0" )
object HashingTF extends DefaultParamsReadable[HashingTF] with Serializable

Annotations
@Since( "1.6.0" )
object IDF extends DefaultParamsReadable[IDF] with Serializable

Annotations
@Since( "1.6.0" )
object IDFModel extends MLReadable[IDFModel] with Serializable

Annotations
@Since( "1.6.0" )
object IndexToString extends DefaultParamsReadable[IndexToString] with Serializable

Annotations
@Since( "1.6.0" )
object Interaction extends DefaultParamsReadable[Interaction] with Serializable

Annotations
@Since( "1.6.0" )
object MaxAbsScaler extends DefaultParamsReadable[MaxAbsScaler] with Serializable

Annotations
@Since( "2.0.0" )
object MaxAbsScalerModel extends MLReadable[MaxAbsScalerModel] with Serializable

Annotations
@Since( "2.0.0" )
object MinMaxScaler extends DefaultParamsReadable[MinMaxScaler] with Serializable

Annotations
@Since( "1.6.0" )
object MinMaxScalerModel extends MLReadable[MinMaxScalerModel] with Serializable

Annotations
@Since( "1.6.0" )
object NGram extends DefaultParamsReadable[NGram] with Serializable

Annotations
@Since( "1.6.0" )
object Normalizer extends DefaultParamsReadable[Normalizer] with Serializable

Annotations
@Since( "1.6.0" )
object OneHotEncoder extends DefaultParamsReadable[OneHotEncoder] with Serializable

Annotations
@Since( "1.6.0" )
object PCA extends DefaultParamsReadable[PCA] with Serializable

Annotations
@Since( "1.6.0" )
object PCAModel extends MLReadable[PCAModel] with Serializable

Annotations
@Since( "1.6.0" )
object PolynomialExpansion extends DefaultParamsReadable[PolynomialExpansion] with Serializable

The expansion is done via recursion. Given n features and degree d, the size after expansion is (n + d choose d) (including 1 and first-order values). For example, let f([a, b, c], 3) be the function that expands [a, b, c] to their monomials of degree 3. We have the following recursion:
```
f([a, b, c], 3) = f([a, b], 3) ++ f([a, b], 2) * c ++ f([a, b], 1) * c^2 ++ [c^3]
```
To handle sparsity, if c is zero, we can skip all monomials that contain it. We remember the current index and increment it properly for sparse input.
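The recursion can be sketched in plain Scala as follows. This is an illustration only, not Spark's implementation; the names PolynomialExpansionSketch, expand, and go are hypothetical. It builds all monomials of total degree up to d and then drops the leading constant 1, and it omits the sparse shortcut described above.

```scala
object PolynomialExpansionSketch {
  /** Expands values into all monomials of degree 1 to degree (constant term dropped). */
  def expand(values: Vector[Double], degree: Int): Vector[Double] = {
    // Monomials over values(0..lastIdx) of degree <= deg, each times multiplier.
    def go(lastIdx: Int, deg: Int, multiplier: Double): Vector[Double] =
      if (deg == 0 || lastIdx < 0) Vector(multiplier)
      else {
        val v = values(lastIdx)
        // f(prefix, deg) ++ f(prefix, deg - 1) * v ++ ... ++ [v^deg]
        (0 to deg).toVector.flatMap { i =>
          go(lastIdx - 1, deg - i, multiplier * math.pow(v, i))
        }
      }
    go(values.length - 1, degree, 1.0).tail // drop the constant term 1
  }
}
```

For (x, y) = (2.0, 3.0) and degree 2 this gives (2.0, 4.0, 3.0, 6.0, 9.0), that is (x, x * x, y, x * y, y * y). For n = 3 and d = 3 the result has (6 choose 3) - 1 = 19 entries.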
Annotations
@Since( "1.6.0" )
object QuantileDiscretizer extends DefaultParamsReadable[QuantileDiscretizer] with Logging with Serializable

Annotations
@Since( "1.6.0" )
object RFormula extends DefaultParamsReadable[RFormula] with Serializable

Annotations
@Since( "2.0.0" )
object RFormulaModel extends MLReadable[RFormulaModel] with Serializable

Annotations
@Since( "2.0.0" )
object RegexTokenizer extends DefaultParamsReadable[RegexTokenizer] with Serializable

Annotations
@Since( "1.6.0" )
object SQLTransformer extends DefaultParamsReadable[SQLTransformer] with Serializable

Annotations
@Since( "1.6.0" )
object StandardScaler extends DefaultParamsReadable[StandardScaler] with Serializable

Annotations
@Since( "1.6.0" )
object StandardScalerModel extends MLReadable[StandardScalerModel] with Serializable

Annotations
@Since( "1.6.0" )
object StopWordsRemover extends DefaultParamsReadable[StopWordsRemover] with Serializable

Annotations
@Since( "1.6.0" )
object StringIndexer extends DefaultParamsReadable[StringIndexer] with Serializable

Annotations
@Since( "1.6.0" )
object StringIndexerModel extends MLReadable[StringIndexerModel] with Serializable

Annotations
@Since( "1.6.0" )
object Tokenizer extends DefaultParamsReadable[Tokenizer] with Serializable

Annotations
@Since( "1.6.0" )
object VectorAssembler extends DefaultParamsReadable[VectorAssembler] with Serializable

Annotations
@Since( "1.6.0" )
object VectorIndexer extends DefaultParamsReadable[VectorIndexer] with Serializable

Annotations
@Since( "1.6.0" )
object VectorIndexerModel extends MLReadable[VectorIndexerModel] with Serializable

Annotations
@Since( "1.6.0" )
object VectorSlicer extends DefaultParamsReadable[VectorSlicer] with Serializable

Annotations
@Since( "1.6.0" )
object Word2Vec extends DefaultParamsReadable[Word2Vec] with Serializable

Annotations
@Since( "1.6.0" )
object Word2VecModel extends MLReadable[Word2VecModel] with Serializable

Annotations
@Since( "1.6.0" )

package feature

Feature transformers
