Class

org.apache.spark.mllib.clustering

GaussianMixture

Related Doc: package clustering

class GaussianMixture extends Serializable

This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). A GMM represents a composite distribution of independent Gaussian distributions, each with an associated "mixing" weight that specifies its contribution to the composite.

Given a set of sample points, this class maximizes the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol or the maximum number of iterations is reached. Although this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
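
The stopping rule described above can be illustrated with a minimal, plain-Scala sketch of EM for a one-dimensional mixture. This is not Spark's implementation: the names EmSketch, Component, and fit are hypothetical, the data is one-dimensional rather than multivariate, and the variance floor of 1e-6 is an assumption added here to keep the sketch numerically safe.

```scala
// Illustrative 1-D EM for a Gaussian mixture. NOT Spark's implementation;
// all names here are hypothetical and exist only for this sketch.
object EmSketch {
  final case class Component(weight: Double, mean: Double, variance: Double)

  // Density of a univariate Gaussian at x.
  private def pdf(x: Double, c: Component): Double =
    math.exp(-0.5 * (x - c.mean) * (x - c.mean) / c.variance) /
      math.sqrt(2.0 * math.Pi * c.variance)

  // Log-likelihood of the data under the weighted mixture.
  def logLikelihood(xs: Seq[Double], cs: Seq[Component]): Double =
    xs.map(x => math.log(cs.map(c => c.weight * pdf(x, c)).sum)).sum

  /** Returns the fitted components and the number of iterations performed. */
  def fit(xs: Seq[Double], init: Seq[Component],
          convergenceTol: Double = 0.01,
          maxIterations: Int = 100): (Seq[Component], Int) = {
    var cs = init
    var ll = logLikelihood(xs, cs)
    var iter = 0
    var converged = false
    while (iter < maxIterations && !converged) {
      // E-step: responsibility of each component for each point.
      val resp = xs.map { x =>
        val p = cs.map(c => c.weight * pdf(x, c))
        val total = p.sum
        p.map(_ / total)
      }
      // M-step: re-estimate weight, mean, and variance of each component.
      cs = cs.indices.map { j =>
        val r = resp.map(_(j))
        val nj = r.sum
        val mean = xs.zip(r).map { case (x, w) => w * x }.sum / nj
        val variance =
          xs.zip(r).map { case (x, w) => w * (x - mean) * (x - mean) }.sum / nj
        // Variance floor is an assumption of this sketch.
        Component(nj / xs.size, mean, math.max(variance, 1e-6))
      }
      // Stop once the log-likelihood changes by less than convergenceTol.
      val newLl = logLikelihood(xs, cs)
      converged = math.abs(newLl - ll) < convergenceTol
      ll = newLl
      iter += 1
    }
    (cs, iter)
  }
}
```

Calling EmSketch.fit with points clustered near 0 and 10 and rough initial means should move the two component means toward those clusters. A different initialization can end in a different local optimum, which is the caveat stated above.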

Note: For high-dimensional data (with many features), this algorithm may perform poorly, because (a) high-dimensional data is difficult to cluster at all (based on statistical/theoretical arguments) and (b) Gaussian distributions are prone to numerical issues in high dimensions.

Annotations
@Since( "1.3.0" )
Source
GaussianMixture.scala
Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new GaussianMixture()

    Constructs a default instance. The default parameters are {k: 2, convergenceTol: 0.01, maxIterations: 100, seed: random}.

    Annotations
    @Since( "1.3.0" )
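
As a brief illustration of the defaults listed above, the following sketch constructs a default instance and a tuned one using the setters documented on this page. It assumes spark-mllib is on the classpath; the object name GaussianMixtureDefaults and the chosen parameter values are hypothetical.

```scala
import org.apache.spark.mllib.clustering.GaussianMixture

// Hypothetical holder object, used only for this illustration.
object GaussianMixtureDefaults {
  // Default instance: k = 2, convergenceTol = 0.01, maxIterations = 100,
  // and a randomly chosen seed.
  val defaults = new GaussianMixture()

  // Each setter returns this.type, so the calls can be chained.
  val tuned = new GaussianMixture()
    .setK(3)
    .setConvergenceTol(1e-4)
    .setMaxIterations(200)
    .setSeed(42L)
}
```

Fixing the seed with setSeed makes the random initialization repeatable between runs on the same data.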

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def getConvergenceTol: Double

    Return the largest change in log-likelihood at which convergence is considered to have occurred.

    Annotations
    @Since( "1.3.0" )
  11. def getInitialModel: Option[GaussianMixtureModel]

    Return the user-supplied initial GMM, if one was set.

    Annotations
    @Since( "1.3.0" )
  12. def getK: Int

    Return the number of Gaussians in the mixture model

    Annotations
    @Since( "1.3.0" )
  13. def getMaxIterations: Int

    Return the maximum number of iterations allowed

    Annotations
    @Since( "1.3.0" )
  14. def getSeed: Long

    Return the random seed

    Annotations
    @Since( "1.3.0" )
  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. final def notify(): Unit

    Definition Classes
    AnyRef
  19. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  20. def run(data: JavaRDD[Vector]): GaussianMixtureModel

    Java-friendly version of run()

    Annotations
    @Since( "1.3.0" )
  21. def run(data: RDD[Vector]): GaussianMixtureModel

    Perform expectation maximization on the given data and return the fitted model.

    Annotations
    @Since( "1.3.0" )
  22. def setConvergenceTol(convergenceTol: Double): GaussianMixture.this.type

    Set the largest change in log-likelihood at which convergence is considered to have occurred.

    Annotations
    @Since( "1.3.0" )
  23. def setInitialModel(model: GaussianMixtureModel): GaussianMixture.this.type

    Set the initial GMM starting point, bypassing the random initialization. You must call setK() before calling this method, and the condition (model.k == this.k) must hold; otherwise an IllegalArgumentException is thrown.

    Annotations
    @Since( "1.3.0" )
  24. def setK(k: Int): GaussianMixture.this.type

    Set the number of Gaussians in the mixture model. Default: 2

    Annotations
    @Since( "1.3.0" )
  25. def setMaxIterations(maxIterations: Int): GaussianMixture.this.type

    Set the maximum number of iterations allowed. Default: 100

    Annotations
    @Since( "1.3.0" )
  26. def setSeed(seed: Long): GaussianMixture.this.type

    Set the random seed

    Annotations
    @Since( "1.3.0" )
  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. def toString(): String

    Definition Classes
    AnyRef → Any
  29. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
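
The setInitialModel precondition and the run(data: RDD[Vector]) member listed above can be sketched together as follows. This is a minimal illustration, assuming spark-core and spark-mllib are on the classpath and a local master is acceptable. The object name GaussianMixtureRunSketch, the sample points, the starting Gaussians, and the application name are all hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel}
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.stat.distribution.MultivariateGaussian
import scala.util.Try

// Hypothetical object, used only for this illustration.
object GaussianMixtureRunSketch {
  // Hypothetical starting point: two unit-covariance Gaussians in 2-D.
  def initialModel: GaussianMixtureModel = {
    val eye = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
    new GaussianMixtureModel(
      Array(0.5, 0.5),
      Array(
        new MultivariateGaussian(Vectors.dense(0.0, 0.0), eye),
        new MultivariateGaussian(Vectors.dense(5.0, 5.0), eye)))
  }

  // The model has 2 components but k is set to 3, so setInitialModel
  // is expected to fail with an IllegalArgumentException.
  def mismatchFails: Boolean =
    Try(new GaussianMixture().setK(3).setInitialModel(initialModel)).isFailure

  // Fit a 2-component mixture, starting from initialModel instead of
  // the random initialization.
  def fit(sc: SparkContext): GaussianMixtureModel = {
    val points = sc.parallelize(Seq(
      Vectors.dense(0.1, -0.2), Vectors.dense(-0.3, 0.1),
      Vectors.dense(0.2, 0.3), Vectors.dense(-0.1, -0.4),
      Vectors.dense(0.4, 0.0),
      Vectors.dense(5.1, 4.8), Vectors.dense(4.7, 5.2),
      Vectors.dense(5.3, 5.1), Vectors.dense(4.9, 4.6),
      Vectors.dense(5.2, 5.4)))
    new GaussianMixture()
      .setK(2) // must match initialModel.k and be set first
      .setInitialModel(initialModel)
      .run(points)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("gmm-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val model = fit(sc)
      // Print the mixing weight, mean, and covariance of each component.
      model.weights.zip(model.gaussians).foreach { case (w, g) =>
        println(s"weight=$w mu=${g.mu} sigma=\n${g.sigma}")
      }
    } finally sc.stop()
  }
}
```

Calling setK before setInitialModel matters because the component-count check runs inside setInitialModel, against whatever k is set at that moment.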
