org.apache.spark.mllib.stat.correlation

Class PearsonCorrelation

• Object
• org.apache.spark.mllib.stat.correlation.PearsonCorrelation

• ```public class PearsonCorrelation
extends Object```
Computes the Pearson correlation for two RDDs of type RDD[Double], or the correlation matrix for an RDD of type RDD[Vector].

The definition of Pearson correlation can be found at http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

• Constructor Summary

Constructors
Constructor and Description
`PearsonCorrelation()`
• Method Summary

Methods
Modifier and Type Method and Description
`static double` ```computeCorrelation(RDD<Object> x, RDD<Object> y)```
Compute the Pearson correlation for two datasets.
`static Matrix` `computeCorrelationMatrix(RDD<Vector> X)`
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
`static Matrix` `computeCorrelationMatrixFromCovariance(Matrix covarianceMatrix)`
Compute the Pearson correlation matrix from the covariance matrix.
`static double` ```computeCorrelationWithMatrixImpl(RDD<Object> x, RDD<Object> y)```
• Methods inherited from class Object

`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
• Constructor Detail

• PearsonCorrelation

`public PearsonCorrelation()`
• Method Detail

• computeCorrelation

```public static double computeCorrelation(RDD<Object> x,
RDD<Object> y)```
Compute the Pearson correlation for two datasets. Returns NaN if either dataset has zero variance.
Parameters:
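`x` - the first dataset, an RDD of doubles
`y` - the second dataset, an RDD of doubles
Returns:
the Pearson correlation of `x` and `y`, or NaN if either has zero variance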
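As a rough illustration of the quantity this method returns, the following sketch computes the Pearson correlation of two plain Java arrays. It is not Spark's implementation: `PearsonSketch` and `pearson` are hypothetical names, and `double[]` stands in for the RDD inputs as an assumption made for the sake of a self-contained example.

```java
// Illustrative sketch only: PearsonSketch is a hypothetical class, not part of Spark,
// and plain arrays stand in for the RDD inputs.
final class PearsonSketch {

    /** Pearson correlation of x and y; NaN if either input has zero variance. */
    static double pearson(double[] x, double[] y) {
        if (x.length == 0 || x.length != y.length) {
            throw new IllegalArgumentException("inputs must be non-empty and of equal length");
        }
        int n = x.length;

        // First pass: means of both inputs.
        double sumX = 0.0, sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;

        // Second pass: centered sums of squares and cross-products.
        double sxx = 0.0, syy = 0.0, sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }

        // Zero variance in either input makes the correlation undefined.
        if (sxx == 0.0 || syy == 0.0) {
            return Double.NaN;
        }
        return sxy / Math.sqrt(sxx * syy);
    }
}
```

For example, `PearsonSketch.pearson(new double[]{1, 2, 3}, new double[]{2, 4, 6})` describes two perfectly linearly related inputs, while a constant input such as `{1, 1, 1}` yields NaN.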
• computeCorrelationMatrix

`public static Matrix computeCorrelationMatrix(RDD<Vector> X)`
Compute the Pearson correlation matrix S for the input matrix, where S(i, j) is the correlation between columns i and j. A column with zero variance results in a correlation value of Double.NaN.
Parameters:
`X` - the input matrix, as an RDD of vectors
Returns:
the Pearson correlation matrix S
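The matrix S described above can be sketched in plain Java as follows. This is a minimal illustration, not Spark's code: `CorrelationMatrixSketch` is a hypothetical class, and the input is assumed to be a `double[][]` whose rows are observations and whose columns are variables, standing in for the RDD of vectors. In this sketch every entry that involves a zero-variance column, including its diagonal entry, comes out as NaN.

```java
// Illustrative sketch only: CorrelationMatrixSketch is a hypothetical class, not part of Spark.
// Assumes rows[k] is one observation and all rows have the same length.
final class CorrelationMatrixSketch {

    /** Returns S where S[i][j] is the Pearson correlation between columns i and j. */
    static double[][] correlationMatrix(double[][] rows) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("at least one row is required");
        }
        int n = rows.length;
        int d = rows[0].length;

        // Column means: sum first, then divide once.
        double[] mean = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= n;
        }

        // Centered cross-products; c[i][i] is proportional to the variance of column i.
        double[][] c = new double[d][d];
        for (double[] row : rows) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    c[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]);
                }
            }
        }

        // Normalize by the product of standard deviations; zero variance gives NaN.
        double[][] s = new double[d][d];
        for (int i = 0; i < d; i++) {
            for (int j = 0; j < d; j++) {
                double denom = Math.sqrt(c[i][i] * c[j][j]);
                s[i][j] = (denom == 0.0) ? Double.NaN : c[i][j] / denom;
            }
        }
        return s;
    }
}
```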
• computeCorrelationMatrixFromCovariance

`public static Matrix computeCorrelationMatrixFromCovariance(Matrix covarianceMatrix)`
Compute the Pearson correlation matrix from the covariance matrix. A variable with zero variance results in a correlation value of Double.NaN.
Parameters:
`covarianceMatrix` - the covariance matrix to convert
Returns:
the corresponding Pearson correlation matrix
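The conversion from covariance to correlation divides each entry by the product of the two standard deviations taken from the diagonal. The sketch below shows that step on a `double[][]`; `CovarianceToCorrelationSketch` is a hypothetical class, and the array type is an assumption standing in for `Matrix`. Fixing the diagonal at 1.0 while marking only off-diagonal entries with a zero-variance variable as NaN is a choice made for this sketch, not a statement about the library's behavior.

```java
// Illustrative sketch only: CovarianceToCorrelationSketch is a hypothetical class, not part of Spark.
// Assumes cov is a square, symmetric covariance matrix.
final class CovarianceToCorrelationSketch {

    /** Returns corr where corr[i][j] = cov[i][j] / (sd[i] * sd[j]). */
    static double[][] fromCovariance(double[][] cov) {
        int n = cov.length;
        for (double[] row : cov) {
            if (row.length != n) {
                throw new IllegalArgumentException("covariance matrix must be square");
            }
        }

        // Standard deviations are the square roots of the diagonal entries.
        double[] sd = new double[n];
        for (int i = 0; i < n; i++) {
            sd[i] = Math.sqrt(cov[i][i]);
        }

        double[][] corr = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i == j) {
                    // Sketch assumption: the diagonal is fixed at 1.0.
                    corr[i][j] = 1.0;
                } else if (sd[i] == 0.0 || sd[j] == 0.0) {
                    // Zero variance makes the correlation undefined.
                    corr[i][j] = Double.NaN;
                } else {
                    corr[i][j] = cov[i][j] / (sd[i] * sd[j]);
                }
            }
        }
        return corr;
    }
}
```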
• computeCorrelationWithMatrixImpl

```public static double computeCorrelationWithMatrixImpl(RDD<Object> x,
RDD<Object> y)```