public class ChiSquareTest extends Object
Chi-square hypothesis testing for categorical data.
See Wikipedia for more information on the Chi-squared test.
Conduct Pearson's independence test for every feature against the label.
For each feature, the null hypothesis is that the feature's values and the label's values occur independently of each other.
dataset - DataFrame of categorical labels and categorical features. Real-valued features are treated as categorical, with each distinct value forming its own category.
featuresCol - Name of the features column in dataset, of type
labelCol - Name of the label column in dataset, of any numerical type
statistics: Vector
Each of these fields has one value per feature.
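
To make the description above concrete, the following is a minimal plain-Java sketch of Pearson's independence test for a single feature against the label. It is not Spark's implementation and does not use the Spark API. The class `ChiSquareSketch` and its methods `contingencyTable`, `statistic` and `degreesOfFreedom` are hypothetical names introduced for illustration only. The sketch treats every distinct feature value as its own category, as described for the dataset parameter. It stops at the test statistic and the degrees of freedom, because turning the statistic into a p-value needs the chi-square distribution's CDF, which the Java standard library does not provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only: hypothetical class, not part of Spark.
public class ChiSquareSketch {

    // Collects the distinct values of a column in ascending order.
    static List<Double> distinctSorted(double[] values) {
        TreeSet<Double> set = new TreeSet<>();
        for (double v : values) {
            set.add(v);
        }
        return new ArrayList<>(set);
    }

    // Builds the observed counts: one row per distinct feature value,
    // one column per distinct label value.
    static long[][] contingencyTable(double[] feature, double[] label) {
        List<Double> featureValues = distinctSorted(feature);
        List<Double> labelValues = distinctSorted(label);
        long[][] observed = new long[featureValues.size()][labelValues.size()];
        for (int i = 0; i < feature.length; i++) {
            int row = featureValues.indexOf(feature[i]);
            int col = labelValues.indexOf(label[i]);
            observed[row][col]++;
        }
        return observed;
    }

    // Pearson's statistic: sum over all cells of (observed - expected)^2 / expected,
    // where expected = rowTotal * columnTotal / grandTotal under independence.
    // Assumes every row and column total is positive, which holds for tables
    // produced by contingencyTable above.
    static double statistic(long[][] observed) {
        int rows = observed.length;
        int cols = observed[0].length;
        double[] rowTotal = new double[rows];
        double[] colTotal = new double[cols];
        double grandTotal = 0.0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                rowTotal[r] += observed[r][c];
                colTotal[c] += observed[r][c];
                grandTotal += observed[r][c];
            }
        }
        double chi2 = 0.0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double expected = rowTotal[r] * colTotal[c] / grandTotal;
                double diff = observed[r][c] - expected;
                chi2 += diff * diff / expected;
            }
        }
        return chi2;
    }

    // Degrees of freedom for an independence test: (rows - 1) * (columns - 1).
    static int degreesOfFreedom(long[][] observed) {
        return (observed.length - 1) * (observed[0].length - 1);
    }

    public static void main(String[] args) {
        // Made-up toy data: one feature column and one label column.
        double[] feature = {0, 0, 0, 0, 1, 1, 1, 1};
        double[] label   = {0, 0, 0, 1, 1, 1, 1, 0};
        long[][] observed = contingencyTable(feature, label);
        System.out.println("statistic = " + statistic(observed)
                + ", degreesOfFreedom = " + degreesOfFreedom(observed));
    }
}
```

A large statistic relative to the degrees of freedom is evidence against the null hypothesis of independence. To test several features, the same computation would be repeated once per feature, giving one value per feature.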