org.apache.spark.mllib.clustering

GaussianMixture

class GaussianMixture extends Serializable

:: Experimental ::

This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). A GMM represents a composite distribution of independent Gaussian distributions with associated "mixing" weights specifying each's contribution to the composite.

Given a set of sample points, this class will maximize the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol, or until it has reached the max number of iterations. While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.

Note: For high-dimensional data (with many features), this algorithm may perform poorly. This is due to high-dimensional data (a) making it difficult to cluster at all (based on statistical/theoretical arguments) and (b) numerical issues with Gaussian distributions.

Annotations
@Experimental()
Linear Supertypes
Serializable, Serializable, AnyRef, Any
Ordering
  1. Alphabetic
  2. By inheritance
Inherited
  1. GaussianMixture
  2. Serializable
  3. Serializable
  4. AnyRef
  5. Any
  1. Hide All
  2. Show all
Learn more about member selection
Visibility
  1. Public
  2. All

Instance Constructors

  1. new GaussianMixture()

    Constructs a default instance.

    Constructs a default instance. The default parameters are {k: 2, convergenceTol: 0.01, maxIterations: 100, seed: random}.

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def getConvergenceTol: Double

    Return the largest change in log-likelihood at which convergence is considered to have occurred.

  13. def getInitialModel: Option[GaussianMixtureModel]

    Return the user supplied initial GMM, if supplied

  14. def getK: Int

    Return the number of Gaussians in the mixture model

  15. def getMaxIterations: Int

    Return the maximum number of iterations to run

  16. def getSeed: Long

    Return the random seed

  17. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. final def notify(): Unit

    Definition Classes
    AnyRef
  21. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  22. def run(data: RDD[Vector]): GaussianMixtureModel

    Perform expectation maximization

  23. def setConvergenceTol(convergenceTol: Double): GaussianMixture.this.type

    Set the largest change in log-likelihood at which convergence is considered to have occurred.

  24. def setInitialModel(model: GaussianMixtureModel): GaussianMixture.this.type

    Set the initial GMM starting point, bypassing the random initialization.

    Set the initial GMM starting point, bypassing the random initialization. You must call setK() prior to calling this method, and the condition (model.k == this.k) must be met; failure will result in an IllegalArgumentException

  25. def setK(k: Int): GaussianMixture.this.type

    Set the number of Gaussians in the mixture model.

    Set the number of Gaussians in the mixture model. Default: 2

  26. def setMaxIterations(maxIterations: Int): GaussianMixture.this.type

    Set the maximum number of iterations to run.

    Set the maximum number of iterations to run. Default: 100

  27. def setSeed(seed: Long): GaussianMixture.this.type

    Set the random seed

  28. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  29. def toString(): String

    Definition Classes
    AnyRef → Any
  30. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any

Ungrouped