public interface CountVectorizerParams extends Params, HasInputCol, HasOutputCol
Params for CountVectorizer and CountVectorizerModel.

| Modifier and Type | Method and Description |
|---|---|
| BooleanParam | binary(): Binary toggle to control the output vector values. |
| boolean | getBinary() |
| double | getMaxDF() |
| double | getMinDF() |
| double | getMinTF() |
| int | getVocabSize() |
| DoubleParam | maxDF(): Specifies the maximum number of different documents a term could appear in to be included in the vocabulary. |
| DoubleParam | minDF(): Specifies the minimum number of different documents a term must appear in to be included in the vocabulary. |
| DoubleParam | minTF(): Filter to ignore rare words in a document. |
| StructType | validateAndTransformSchema(StructType schema): Validates and transforms the input schema. |
| IntParam | vocabSize(): Max size of the vocabulary. |
Methods inherited from HasInputCol: getInputCol, inputCol
Methods inherited from HasOutputCol: getOutputCol, outputCol
Methods inherited from Params: clear, copy, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from Identifiable: toString, uid

BooleanParam binary()
boolean getBinary()
double getMaxDF()
double getMinDF()
double getMinTF()
int getVocabSize()
DoubleParam maxDF()
Default: 2^63 - 1
DoubleParam minDF()
Default: 1.0
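
To make the interaction of minDF and maxDF concrete, the following is a minimal plain-Java sketch of document-frequency filtering. It does not use Spark and is not the library's implementation: the class `DfFilterSketch` and its methods are hypothetical names introduced for illustration only. The sketch also assumes that a threshold below 1.0 is read as a fraction of the corpus size, a convention not stated in the descriptions above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Spark.
final class DfFilterSketch {

    // Assumption: thresholds >= 1.0 are absolute document counts,
    // thresholds below 1.0 are fractions of the number of documents.
    static double resolve(double threshold, int numDocs) {
        return threshold >= 1.0 ? threshold : threshold * numDocs;
    }

    // Returns the terms whose document frequency lies in [minDF, maxDF].
    static Set<String> vocabulary(List<List<String>> docs, double minDF, double maxDF) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (List<String> doc : docs) {
            // Deduplicate so each document counts a term at most once.
            for (String term : new HashSet<>(doc)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        double lower = resolve(minDF, docs.size());
        double upper = resolve(maxDF, docs.size());
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : docFreq.entrySet()) {
            if (entry.getValue() >= lower && entry.getValue() <= upper) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```

With the defaults listed here (minDF of 1.0 and a maxDF of 2^63 - 1), every term that occurs in at least one document passes the filter, so in this sketch only vocabSize would limit the vocabulary.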
DoubleParam minTF()
Note that the parameter is only used in transform of CountVectorizerModel and does not
affect fitting.
Default: 1.0
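
Because minTF applies only at transform time, its effect can be sketched independently of fitting. The plain-Java example below is an illustration under stated assumptions, not Spark's code: `MinTfSketch` and `transform` are hypothetical names, the vocabulary is passed in as an already fitted list, a minTF below 1.0 is assumed to mean a fraction of the document's token count, and the binary toggle is assumed to replace every surviving count with 1.0.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Spark.
final class MinTfSketch {

    // Counts vocabulary terms in one document, drops terms whose count is
    // below the minTF threshold, and optionally binarizes the result.
    static Map<String, Double> transform(List<String> doc, List<String> vocab,
                                         double minTF, boolean binary) {
        Map<String, Double> counts = new TreeMap<>();
        for (String token : doc) {
            if (vocab.contains(token)) {
                counts.merge(token, 1.0, Double::sum);
            }
        }
        // Assumption: values >= 1.0 are absolute counts, values below 1.0
        // are fractions of the document's token count.
        double threshold = minTF >= 1.0 ? minTF : minTF * doc.size();
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Double> entry : counts.entrySet()) {
            if (entry.getValue() >= threshold) {
                result.put(entry.getKey(), binary ? 1.0 : entry.getValue());
            }
        }
        return result;
    }
}
```

In this sketch the vocabulary is an input that the method never modifies, which mirrors the note that minTF does not affect fitting.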
StructType validateAndTransformSchema(StructType schema)
IntParam vocabSize()
Default: 2^18 (262144)