Object
  org.apache.spark.ml.feature.VectorIndexer.CategoryStats
public static class VectorIndexer.CategoryStats
Helper class for tracking unique values for each feature.
TODO: Track which features are known to be continuous already; do not update counts for them.
param: numFeatures This class fails if it encounters a Vector whose length is not numFeatures.
param: maxCategories This class caps the number of unique values collected at maxCategories.
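The two parameter notes above can be illustrated with a minimal sketch. It is not Spark's implementation: `UniqueValueTracker`, `uniqueCount`, and the use of `double[]` in place of `Vector` are hypothetical names and simplifications chosen for this example. The sketch also assumes that one value beyond `maxCategories` is retained per feature, so that a feature with too many distinct values can still be told apart from one that has exactly `maxCategories`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the documented class; not Spark's code.
final class UniqueValueTracker {
    private final int numFeatures;
    private final int maxCategories;
    // One set of distinct values per feature (column) index.
    private final List<Set<Double>> valueSets = new ArrayList<>();

    UniqueValueTracker(int numFeatures, int maxCategories) {
        this.numFeatures = numFeatures;
        this.maxCategories = maxCategories;
        for (int i = 0; i < numFeatures; i++) {
            valueSets.add(new LinkedHashSet<>());
        }
    }

    // Records each feature value of v; fails if v has the wrong length.
    void addVector(double[] v) {
        if (v.length != numFeatures) {
            throw new IllegalArgumentException(
                "Expected " + numFeatures + " features but got " + v.length);
        }
        for (int i = 0; i < numFeatures; i++) {
            addCapped(valueSets.get(i), v[i]);
        }
    }

    // Folds the other tracker's values into this one and returns this instance.
    UniqueValueTracker merge(UniqueValueTracker other) {
        if (other.numFeatures != numFeatures) {
            throw new IllegalArgumentException("Feature counts differ");
        }
        for (int i = 0; i < numFeatures; i++) {
            for (double value : other.valueSets.get(i)) {
                addCapped(valueSets.get(i), value);
            }
        }
        return this;
    }

    // Assumption: stop collecting once a set holds maxCategories + 1 values.
    private void addCapped(Set<Double> seen, double value) {
        if (seen.size() <= maxCategories) {
            seen.add(value);
        }
    }

    // Number of distinct values recorded so far for one feature.
    int uniqueCount(int feature) {
        return valueSets.get(feature).size();
    }
}
```

With `maxCategories = 2`, a feature that has taken the values 1.0, 2.0, 3.0 and 4.0 ends up with three recorded values in this sketch: enough to know the cap was exceeded, without storing every value.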
Constructor Summary | |
---|---|
VectorIndexer.CategoryStats(int numFeatures, int maxCategories) | |
Method Summary | |
---|---|
void | addVector(Vector v) Add a new vector to this index, updating sets of unique feature values. |
scala.collection.immutable.Map<Object,scala.collection.immutable.Map<Object,Object>> | getCategoryMaps() Based on stats collected, decide which features are categorical, and choose indices for categories. |
VectorIndexer.CategoryStats | merge(VectorIndexer.CategoryStats other) Merge with another instance, modifying this instance. |
Methods inherited from class Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public VectorIndexer.CategoryStats(int numFeatures, int maxCategories)
Method Detail |
---|
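public VectorIndexer.CategoryStats merge(VectorIndexer.CategoryStats other)
Merge with another instance, modifying this instance.
public void addVector(Vector v)
Add a new vector to this index, updating sets of unique feature values.
public scala.collection.immutable.Map<Object,scala.collection.immutable.Map<Object,Object>> getCategoryMaps()
Based on stats collected, decide which features are categorical, and choose indices for categories.
Sparsity: This method tries to maintain sparsity by treating value 0.0 specially. If a categorical feature takes value 0.0, then value 0.0 is given index 0.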
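The sparsity rule can be made concrete with a short sketch. It is an illustration under stated assumptions, not Spark's code: `CategoryMapSketch` and `categoryMaps` are hypothetical names, plain Java collections replace the Scala maps in the signature above, a feature is assumed to count as categorical when it has at most `maxCategories` distinct values, and nonzero values are assumed to be indexed in ascending order after 0.0.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper showing one way to choose category indices.
final class CategoryMapSketch {
    // Maps feature index -> (original value -> 0-based category index).
    static Map<Integer, Map<Double, Integer>> categoryMaps(
            List<Set<Double>> valueSets, int maxCategories) {
        Map<Integer, Map<Double, Integer>> result = new TreeMap<>();
        for (int feature = 0; feature < valueSets.size(); feature++) {
            Set<Double> values = valueSets.get(feature);
            // Assumption: too many distinct values means the feature is continuous.
            if (values.size() > maxCategories) {
                continue;
            }
            // Sort the nonzero values; 0.0 is handled separately below.
            double[] nonZero = values.stream()
                .mapToDouble(Double::doubleValue)
                .filter(x -> x != 0.0)
                .sorted()
                .toArray();
            Map<Double, Integer> indices = new LinkedHashMap<>();
            // If 0.0 occurred, it always receives index 0, so sparse zeros stay zero.
            if (nonZero.length < values.size()) {
                indices.put(0.0, 0);
            }
            // Remaining values take the next free indices in ascending order.
            for (double x : nonZero) {
                indices.put(x, indices.size());
            }
            result.put(feature, indices);
        }
        return result;
    }
}
```

For a feature whose observed values are -1.0, 0.0 and 2.0, this sketch assigns 0.0 to index 0, -1.0 to index 1 and 2.0 to index 2. Because 0.0 maps to 0, implicit zeros in a sparse vector need no rewriting when the feature is re-indexed.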