Package org.apache.spark.ml.stat
Class ANOVATest
Object
org.apache.spark.ml.stat.ANOVATest
ANOVA test for continuous data.
See Wikipedia for more information on the ANOVA test.
Constructor Details
-
ANOVATest
public ANOVATest()
-
Method Details
-
test
public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol)
- Parameters:
dataset - DataFrame of categorical labels and continuous features.
featuresCol - Name of features column in dataset, of type Vector (VectorUDT).
labelCol - Name of label column in dataset, of any numerical type.
- Returns:
DataFrame containing the test result for every feature against the label. This DataFrame will contain a single Row with the following fields:
- pValues: Vector
- degreesOfFreedom: Array[Long]
- fValues: Vector
Each of these fields has one value per feature.
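To make the per-feature fields above more concrete, the following is a minimal plain-Java sketch of the textbook one-way ANOVA F-statistic for a single continuous feature grouped by a categorical label. It is not Spark's implementation: the class OneWayAnovaSketch, its compute method, and the Result holder are hypothetical names introduced only for illustration. The sketch reports between-group and within-group degrees of freedom separately and makes no claim about how Spark combines them into its degreesOfFreedom field. The p-value, which would be the upper tail of the F distribution at the computed F-value, is omitted because the JDK has no F-distribution CDF.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Spark): one-way ANOVA F-statistic for one feature. */
final class OneWayAnovaSketch {

    /** F-value plus between-group and within-group degrees of freedom. */
    static final class Result {
        final double fValue;
        final long dfBetween;
        final long dfWithin;

        Result(double fValue, long dfBetween, long dfWithin) {
            this.fValue = fValue;
            this.dfBetween = dfBetween;
            this.dfWithin = dfWithin;
        }
    }

    /**
     * @param feature one continuous value per sample
     * @param labels  one categorical label per sample, aligned with feature
     */
    static Result compute(double[] feature, int[] labels) {
        if (feature.length != labels.length) {
            throw new IllegalArgumentException("feature and labels must have equal length");
        }
        int n = feature.length;

        // Accumulate {sum, count} per label, and the overall sum.
        Map<Integer, double[]> stats = new HashMap<>();
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double[] s = stats.computeIfAbsent(labels[i], key -> new double[2]);
            s[0] += feature[i];
            s[1] += 1.0;
            total += feature[i];
        }

        int numClasses = stats.size();
        if (numClasses < 2 || n <= numClasses) {
            throw new IllegalArgumentException("need at least 2 classes and more samples than classes");
        }
        double grandMean = total / n;

        // Between-group sum of squares: class size times squared distance of class mean from grand mean.
        double ssBetween = 0.0;
        for (double[] s : stats.values()) {
            double diff = s[0] / s[1] - grandMean;
            ssBetween += s[1] * diff * diff;
        }

        // Within-group sum of squares: squared distance of each sample from its own class mean.
        double ssWithin = 0.0;
        for (int i = 0; i < n; i++) {
            double[] s = stats.get(labels[i]);
            double diff = feature[i] - s[0] / s[1];
            ssWithin += diff * diff;
        }

        long dfBetween = numClasses - 1;
        long dfWithin = n - numClasses;

        // F is the ratio of the between-group mean square to the within-group mean square.
        double f = (ssBetween / dfBetween) / (ssWithin / dfWithin);
        return new Result(f, dfBetween, dfWithin);
    }
}
```

For a multi-feature dataset, running such a computation once per feature dimension would yield one F-value per feature, which matches the "one value per feature" shape of the fields described above.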
-
test
public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol, boolean flatten)
- Parameters:
dataset - DataFrame of categorical labels and continuous features.
featuresCol - Name of features column in dataset, of type Vector (VectorUDT).
labelCol - Name of label column in dataset, of any numerical type.
flatten - If false, the returned DataFrame contains only a single Row; otherwise, one row per feature.
- Returns:
(undocumented)
-