public class GaussianMixture
extends Object
implements scala.Serializable
Given a set of sample points, this class will maximize the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol, or until it has reached the max number of iterations. While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
param: k Number of independent Gaussians in the mixture model.
param: convergenceTol Maximum change in log-likelihood at which convergence is considered to have occurred.
param: maxIterations Maximum number of iterations allowed.
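The stopping rule described above (stop when the log-likelihood changes by less than convergenceTol, or when maxIterations is reached) can be illustrated with a small one-dimensional EM loop in plain Java. This is a sketch under stated assumptions, not Spark's implementation: the class name EmStoppingSketch, the Result holder, the initialization scheme, and the variance floor are all hypothetical choices made for the example.

```java
import java.util.Random;

/** Illustrative 1-D EM for a mixture of k Gaussians; a sketch, not Spark's implementation. */
final class EmStoppingSketch {

    /** Iterations actually run and the last log-likelihood that was evaluated. */
    static final class Result {
        final int iterations;
        final double logLikelihood;

        Result(int iterations, double logLikelihood) {
            this.iterations = iterations;
            this.logLikelihood = logLikelihood;
        }
    }

    /** Density of a univariate Gaussian with the given mean and variance. */
    static double density(double x, double mean, double variance) {
        double diff = x - mean;
        return Math.exp(-diff * diff / (2.0 * variance)) / Math.sqrt(2.0 * Math.PI * variance);
    }

    static Result fit(double[] data, int k, double convergenceTol, int maxIterations, long seed) {
        int n = data.length;
        Random rng = new Random(seed);

        // Assumed initialization: equal weights, means drawn from the data, shared variance.
        double dataMean = 0.0;
        for (double x : data) dataMean += x / n;
        double dataVar = 0.0;
        for (double x : data) dataVar += (x - dataMean) * (x - dataMean) / n;
        dataVar = Math.max(dataVar, 1e-6);

        double[] weights = new double[k];
        double[] means = new double[k];
        double[] variances = new double[k];
        for (int j = 0; j < k; j++) {
            weights[j] = 1.0 / k;
            means[j] = data[rng.nextInt(n)];
            variances[j] = dataVar;
        }

        double[][] resp = new double[n][k];
        double previous = Double.NEGATIVE_INFINITY;
        double current = Double.NEGATIVE_INFINITY;
        int iterations = 0;

        while (iterations < maxIterations) {
            // E-step: responsibilities and log-likelihood under the current parameters.
            current = 0.0;
            for (int i = 0; i < n; i++) {
                double total = 0.0;
                for (int j = 0; j < k; j++) {
                    resp[i][j] = weights[j] * density(data[i], means[j], variances[j]);
                    total += resp[i][j];
                }
                total = Math.max(total, Double.MIN_VALUE); // guard against underflow to zero
                for (int j = 0; j < k; j++) resp[i][j] /= total;
                current += Math.log(total);
            }

            // M-step: re-estimate weight, mean, and variance of each component.
            for (int j = 0; j < k; j++) {
                double nj = 0.0;
                double sum = 0.0;
                for (int i = 0; i < n; i++) {
                    nj += resp[i][j];
                    sum += resp[i][j] * data[i];
                }
                if (nj < 1e-12) continue; // empty component: keep its old parameters
                double mean = sum / nj;
                double sq = 0.0;
                for (int i = 0; i < n; i++) sq += resp[i][j] * (data[i] - mean) * (data[i] - mean);
                weights[j] = nj / n;
                means[j] = mean;
                variances[j] = Math.max(sq / nj, 1e-6); // assumed floor to avoid collapse
            }

            iterations++;
            // Stop once the log-likelihood has changed by less than convergenceTol.
            if (Math.abs(current - previous) < convergenceTol) break;
            previous = current;
        }
        return new Result(iterations, current);
    }

    public static void main(String[] args) {
        double[] data = new double[200];
        for (int i = 0; i < 100; i++) {
            data[i] = i * 0.01;
            data[100 + i] = 10.0 + i * 0.01;
        }
        Result r = fit(data, 2, 0.01, 100, 7L);
        System.out.println("iterations=" + r.iterations + " logLikelihood=" + r.logLikelihood);
    }
}
```

In this sketch a larger convergenceTol ends the loop earlier, and maxIterations caps the work even if the tolerance is never met. Because the starting means depend on the seed, different seeds can end at different local optima, which is the behavior the class description warns about.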
| Constructor and Description | 
|---|
| GaussianMixture() Constructs a default instance. |
| Modifier and Type | Method and Description |
|---|---|
| double | getConvergenceTol() Return the largest change in log-likelihood at which convergence is considered to have occurred. |
| scala.Option<GaussianMixtureModel> | getInitialModel() Return the user-supplied initial GMM, if supplied. |
| int | getK() Return the number of Gaussians in the mixture model. |
| int | getMaxIterations() Return the maximum number of iterations allowed. |
| long | getSeed() Return the random seed. |
| GaussianMixtureModel | run(JavaRDD<Vector> data) Java-friendly version of run(). |
| GaussianMixtureModel | run(RDD<Vector> data) Perform expectation maximization. |
| GaussianMixture | setConvergenceTol(double convergenceTol) Set the largest change in log-likelihood at which convergence is considered to have occurred. |
| GaussianMixture | setInitialModel(GaussianMixtureModel model) Set the initial GMM starting point, bypassing the random initialization. |
| GaussianMixture | setK(int k) Set the number of Gaussians in the mixture model. |
| GaussianMixture | setMaxIterations(int maxIterations) Set the maximum number of iterations allowed. |
| GaussianMixture | setSeed(long seed) Set the random seed. |
| static boolean | shouldDistributeGaussians(int k, int d) Heuristic to distribute the computation of the MultivariateGaussians, approximately when d is greater than 25 except for when k is very small. |
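The shouldDistributeGaussians heuristic in the last row can be made concrete with a one-line predicate. The expression below, ((k - 1) / k) * d > 25, is one formula consistent with the description ("approximately when d is greater than 25 except for when k is very small"). It is shown as an assumption rather than as a statement of the library's exact code, and the class name DistributeHeuristicSketch is hypothetical.

```java
/** Sketch of a heuristic matching the documented behavior; the formula is an assumption. */
final class DistributeHeuristicSketch {

    /**
     * Returns true when the scaled dimension exceeds 25.
     * The factor (k - 1) / k approaches 1 as k grows, so for large k this is
     * close to "d > 25", while for very small k it stays false until d is larger.
     */
    static boolean shouldDistribute(int k, int d) {
        return ((k - 1.0) / k) * d > 25;
    }

    public static void main(String[] args) {
        System.out.println(shouldDistribute(10, 30)); // 0.9 * 30 = 27   -> true
        System.out.println(shouldDistribute(2, 40));  // 0.5 * 40 = 20   -> false
        System.out.println(shouldDistribute(2, 60));  // 0.5 * 60 = 30   -> true
        System.out.println(shouldDistribute(1, 1000)); // 0.0 * 1000 = 0 -> false
    }
}
```

Under this assumed formula a single Gaussian (k = 1) is never distributed, and with k = 2 the threshold on d is effectively doubled to 50.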
public GaussianMixture()
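public static boolean shouldDistributeGaussians(int k, int d)
Heuristic to distribute the computation of the MultivariateGaussians, approximately when d is greater than 25 except for when k is very small.
k - Number of Gaussians in the mixture model
d - Number of features
public GaussianMixture setInitialModel(GaussianMixtureModel model)
Set the initial GMM starting point, bypassing the random initialization.
model - (undocumented)
public scala.Option<GaussianMixtureModel> getInitialModel()
Return the user-supplied initial GMM, if supplied.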
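public GaussianMixture setK(int k)
Set the number of Gaussians in the mixture model.
k - (undocumented)
public int getK()
Return the number of Gaussians in the mixture model.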
public GaussianMixture setMaxIterations(int maxIterations)
Set the maximum number of iterations allowed.
maxIterations - (undocumented)
public int getMaxIterations()
Return the maximum number of iterations allowed.
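public GaussianMixture setConvergenceTol(double convergenceTol)
Set the largest change in log-likelihood at which convergence is considered to have occurred.
convergenceTol - (undocumented)
public double getConvergenceTol()
Return the largest change in log-likelihood at which convergence is considered to have occurred.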
public GaussianMixture setSeed(long seed)
Set the random seed.
seed - (undocumented)
public long getSeed()
Return the random seed.
public GaussianMixtureModel run(RDD<Vector> data)
Perform expectation maximization.
data - (undocumented)
public GaussianMixtureModel run(JavaRDD<Vector> data)
Java-friendly version of run().
data - (undocumented)