public class ChiSquareTest
extends Object
See Wikipedia for more information on the Chi-squared test.
| Constructor and Description | 
|---|
| ChiSquareTest() | 
| Modifier and Type | Method and Description | 
|---|---|
| static Dataset<Row> | test(Dataset<Row> dataset, String featuresCol, String labelCol) Conduct Pearson's independence test for every feature against the label. | 
| static Dataset<Row> | test(Dataset<Row> dataset, String featuresCol, String labelCol, boolean flatten) | 
public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol)
The null hypothesis is that the occurrence of the outcomes is statistically independent.
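For reference, the textbook form of Pearson's statistic for a single feature is sketched below. The symbols are notation introduced here, not taken from the Spark documentation: $O_{ij}$ is the observed count of rows with feature category $i$ and label category $j$, $E_{ij}$ is the count expected under independence, $N$ is the total number of rows, and $r$ and $c$ are the numbers of distinct feature and label values.

$$
E_{ij} = \frac{\left(\sum_{k} O_{ik}\right)\left(\sum_{k} O_{kj}\right)}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}},
\qquad
\text{degrees of freedom} = (r - 1)(c - 1)
$$

A large statistic relative to the chi-squared distribution with that many degrees of freedom gives a small p-value, which is evidence against the null hypothesis of independence.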
Parameters:
dataset - DataFrame of categorical labels and categorical features. Real-valued features will be treated as categorical for each distinct value.
featuresCol - Name of features column in dataset, of type Vector (VectorUDT)
labelCol - Name of label column in dataset, of any numerical type
Returns:
DataFrame containing the test result for every feature against the label, as a single Row with the following fields:
- pValues: Vector
- degreesOfFreedom: Array[Int]
- statistics: Vector

Each of these fields has one value per feature.
public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol, boolean flatten)
Parameters:
dataset - DataFrame of categorical labels and categorical features. Real-valued features will be treated as categorical for each distinct value.
featuresCol - Name of features column in dataset, of type Vector (VectorUDT)
labelCol - Name of label column in dataset, of any numerical type
flatten - If false, the returned DataFrame contains only a single Row; otherwise, it contains one row per feature.
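To illustrate what "treated as categorical for each distinct value" means for a single feature, here is a minimal plain-Java sketch that builds a contingency table from (feature, label) pairs and computes the Pearson statistic and its degrees of freedom. The class `ChiSquareSketch` and its methods are hypothetical helpers written for this illustration; they are not part of Spark and do not reflect how Spark implements the test. The p-value is omitted because it needs the chi-squared distribution function, which the Java standard library does not provide.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Spark's implementation.
public class ChiSquareSketch {

    // Assign each distinct value a 0-based category index, in sorted order.
    public static Map<Double, Integer> categories(double[] values) {
        Map<Double, Integer> index = new TreeMap<>();
        for (double v : values) {
            index.put(v, 0);
        }
        int next = 0;
        for (Map.Entry<Double, Integer> e : index.entrySet()) {
            e.setValue(next++);
        }
        return index;
    }

    // Observed counts: rows are feature categories, columns are label categories.
    public static long[][] contingency(double[] feature, double[] label) {
        Map<Double, Integer> rows = categories(feature);
        Map<Double, Integer> cols = categories(label);
        long[][] observed = new long[rows.size()][cols.size()];
        for (int i = 0; i < feature.length; i++) {
            observed[rows.get(feature[i])][cols.get(label[i])]++;
        }
        return observed;
    }

    // Pearson statistic: sum over cells of (observed - expected)^2 / expected,
    // where expected = rowSum * colSum / total under independence.
    public static double statistic(long[][] observed) {
        int r = observed.length;
        int c = observed[0].length;
        double[] rowSum = new double[r];
        double[] colSum = new double[c];
        double total = 0;
        for (int i = 0; i < r; i++) {
            for (int j = 0; j < c; j++) {
                rowSum[i] += observed[i][j];
                colSum[j] += observed[i][j];
                total += observed[i][j];
            }
        }
        double chi2 = 0;
        for (int i = 0; i < r; i++) {
            for (int j = 0; j < c; j++) {
                double expected = rowSum[i] * colSum[j] / total;
                double diff = observed[i][j] - expected;
                chi2 += diff * diff / expected;
            }
        }
        return chi2;
    }

    public static int degreesOfFreedom(long[][] observed) {
        return (observed.length - 1) * (observed[0].length - 1);
    }

    public static void main(String[] args) {
        // Made-up data: one feature with two distinct values, a binary label.
        double[] feature = {0, 0, 0, 1, 1, 1, 1, 0};
        double[] label   = {0, 0, 0, 1, 1, 1, 0, 1};
        long[][] observed = contingency(feature, label);
        // prints "statistic=2.0, degreesOfFreedom=1"
        System.out.println("statistic=" + statistic(observed)
                + ", degreesOfFreedom=" + degreesOfFreedom(observed));
    }
}
```

In Spark, the corresponding call is `ChiSquareTest.test(dataset, featuresCol, labelCol)`, which reports one such statistic and degrees-of-freedom value per feature in the features vector, together with a p-value.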