public class ChiSqTest
extends Object
The goodness-of-fit test is conducted on two Vectors, whereas the test of independence is conducted
on an input of type Matrix, in which independence between columns is assessed.
We also provide a method for computing the chi-squared statistic between each feature and the
label for an input RDD[LabeledPoint]. It returns an Array[ChiSqTestResult] whose size equals the
number of features in the input RDD.
Supported methods for goodness of fit: pearson (default)
Supported methods for independence: pearson (default)
More information on Chi-squared test: http://en.wikipedia.org/wiki/Chi-squared_test
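As a rough illustration of what the pearson method measures for goodness of fit, the sketch below evaluates the statistic sum((observed_i - expected_i)^2 / expected_i) in plain Java, without Spark. The class name `PearsonGoodnessOfFit`, its `Result` holder, and the rescaling of `expected` to the observed total are assumptions of this sketch, not part of the ChiSqTest API. The p-value is left out because it needs a chi-squared CDF, which the Java standard library does not provide.

```java
// Illustrative sketch only; this is not Spark's implementation.
final class PearsonGoodnessOfFit {

    /** Hypothetical holder for the statistic and degrees of freedom (no p-value). */
    static final class Result {
        final double statistic;
        final int degreesOfFreedom;

        Result(double statistic, int degreesOfFreedom) {
            this.statistic = statistic;
            this.degreesOfFreedom = degreesOfFreedom;
        }
    }

    /** Computes Pearson's chi-squared statistic for observed vs. expected counts. */
    static Result test(double[] observed, double[] expected) {
        if (observed.length != expected.length) {
            throw new IllegalArgumentException("observed and expected must have the same size");
        }
        if (observed.length < 2) {
            throw new IllegalArgumentException("at least two categories are required");
        }
        double obsSum = 0.0;
        double expSum = 0.0;
        for (int i = 0; i < observed.length; i++) {
            if (observed[i] < 0.0) {
                throw new IllegalArgumentException("observed counts must be non-negative");
            }
            if (expected[i] <= 0.0) {
                throw new IllegalArgumentException("expected values must be positive");
            }
            obsSum += observed[i];
            expSum += expected[i];
        }
        if (obsSum <= 0.0) {
            throw new IllegalArgumentException("observed counts must not all be zero");
        }
        // Assumption of this sketch: rescale expected so both vectors share the same total.
        double scale = obsSum / expSum;
        double statistic = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double e = expected[i] * scale;
            double diff = observed[i] - e;
            statistic += diff * diff / e;
        }
        // k categories leave k - 1 degrees of freedom.
        return new Result(statistic, observed.length - 1);
    }

    public static void main(String[] args) {
        // Uniform expectation over three categories.
        Result r = test(new double[] {10, 20, 30}, new double[] {1, 1, 1});
        System.out.println("statistic = " + r.statistic + ", df = " + r.degreesOfFreedom);
        // prints "statistic = 10.0, df = 2"
    }
}
```

In the example, the uniform expectation is rescaled to 20 per category, so the two off-target categories each contribute 100 / 20 = 5 to the statistic.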
Modifier and Type | Class and Description |
---|---|
static class | ChiSqTest.Method param: name String name for the method. |
static class | ChiSqTest.Method$ |
static class | ChiSqTest.NullHypothesis$ |
Constructor and Description |
---|
ChiSqTest() |
Modifier and Type | Method and Description |
---|---|
static ChiSqTestResult | chiSquared(Vector observed, Vector expected, String methodName) |
static ChiSqTestResult[] | chiSquaredFeatures(RDD<LabeledPoint> data, String methodName) Conduct Pearson's independence test for each feature against the label across the input RDD. |
static ChiSqTestResult | chiSquaredMatrix(Matrix counts, String methodName) |
static void | org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1) |
static org.slf4j.Logger | org$apache$spark$internal$Logging$$log_() |
static ChiSqTest.Method | PEARSON() |
public static ChiSqTest.Method PEARSON()
public static ChiSqTestResult[] chiSquaredFeatures(RDD<LabeledPoint> data, String methodName)
data - (undocumented)
methodName - (undocumented)
public static ChiSqTestResult chiSquared(Vector observed, Vector expected, String methodName)
public static ChiSqTestResult chiSquaredMatrix(Matrix counts, String methodName)
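For orientation, the plain-Java sketch below shows how Pearson's independence statistic is commonly computed from a contingency matrix of counts: each cell's expected value is rowSum * colSum / total, and the degrees of freedom are (rows - 1) * (cols - 1). The class `PearsonIndependence`, the `double[][]` row-major input, and the choice to reject rows or columns that sum to zero are assumptions of this sketch. It does not use Spark's `Matrix` type and is not the library's implementation.

```java
// Illustrative sketch only; this is not Spark's implementation.
final class PearsonIndependence {

    /** Pearson's chi-squared statistic for a rows x cols table of counts. */
    static double statistic(double[][] counts) {
        int rows = counts.length;
        if (rows < 2 || counts[0].length < 2) {
            throw new IllegalArgumentException("need at least a 2 x 2 table");
        }
        int cols = counts[0].length;
        double[] rowSums = new double[rows];
        double[] colSums = new double[cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            if (counts[i].length != cols) {
                throw new IllegalArgumentException("all rows must have the same length");
            }
            for (int j = 0; j < cols; j++) {
                if (counts[i][j] < 0.0) {
                    throw new IllegalArgumentException("counts must be non-negative");
                }
                rowSums[i] += counts[i][j];
                colSums[j] += counts[i][j];
                total += counts[i][j];
            }
        }
        double stat = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count under independence of the row and column variables.
                double expected = rowSums[i] * colSums[j] / total;
                if (!(expected > 0.0)) {
                    // Assumption of this sketch: a zero row or column sum is rejected.
                    throw new IllegalArgumentException("statistic undefined: zero row or column sum");
                }
                double diff = counts[i][j] - expected;
                stat += diff * diff / expected;
            }
        }
        return stat;
    }

    /** Degrees of freedom for a rows x cols contingency table. */
    static int degreesOfFreedom(double[][] counts) {
        return (counts.length - 1) * (counts[0].length - 1);
    }

    public static void main(String[] args) {
        double[][] table = {{10, 20}, {20, 10}};
        // Every expected cell is 15, so the statistic is 4 * 25 / 15, roughly 6.667.
        System.out.println("statistic = " + statistic(table)
                + ", df = " + degreesOfFreedom(table));
    }
}
```

A table whose rows are proportional to each other, such as {{10, 20}, {20, 40}}, matches its expected counts exactly and yields a statistic of 0.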
public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)