GaussianMixture

This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). A GMM represents a composite distribution of independent Gaussian distributions, each with an associated "mixing" weight that specifies its contribution to the composite.

Given a set of sample points, this class maximizes the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol or until the maximum number of iterations is reached. While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
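The stopping rule described above can be illustrated with a small, self-contained sketch. This is not the Spark implementation: it is a one-dimensional, two-component EM loop written for illustration only, and the names GmmEmSketch, pdf, logLikelihood, emStep, and fit are hypothetical. The tol and maxIter parameters play the roles of convergenceTol and the maximum number of iterations.

```scala
// Illustrative 1-D EM for a two-component Gaussian mixture (not Spark's code).
object GmmEmSketch {
  // Density of a 1-D Gaussian with mean mu and variance sigma2.
  def pdf(x: Double, mu: Double, sigma2: Double): Double =
    math.exp(-(x - mu) * (x - mu) / (2 * sigma2)) / math.sqrt(2 * math.Pi * sigma2)

  // Log-likelihood of the samples under the weighted mixture.
  def logLikelihood(xs: Seq[Double], w: Array[Double],
                    mu: Array[Double], s2: Array[Double]): Double =
    xs.map { x => math.log(w.indices.map(j => w(j) * pdf(x, mu(j), s2(j))).sum) }.sum

  // One EM step: returns updated (weights, means, variances).
  def emStep(xs: Seq[Double], w: Array[Double], mu: Array[Double],
             s2: Array[Double]): (Array[Double], Array[Double], Array[Double]) = {
    val k = w.length
    // E-step: responsibility of each component for each sample.
    val resp = xs.map { x =>
      val p = Array.tabulate(k)(j => w(j) * pdf(x, mu(j), s2(j)))
      val total = p.sum
      p.map(_ / total)
    }
    // M-step: re-estimate weights, means, and variances from the responsibilities.
    val nj = Array.tabulate(k)(j => resp.map(_(j)).sum)
    val newW = nj.map(_ / xs.length)
    val newMu = Array.tabulate(k) { j =>
      xs.zip(resp).map { case (x, r) => r(j) * x }.sum / nj(j)
    }
    val newS2 = Array.tabulate(k) { j =>
      val v = xs.zip(resp).map { case (x, r) =>
        r(j) * (x - newMu(j)) * (x - newMu(j))
      }.sum / nj(j)
      math.max(v, 1e-6) // floor keeps the variance strictly positive
    }
    (newW, newMu, newS2)
  }

  // Iterate until the log-likelihood changes by less than tol,
  // or until maxIter iterations have run. Returns the iteration count last.
  def fit(xs: Seq[Double], tol: Double = 0.01, maxIter: Int = 100)
      : (Array[Double], Array[Double], Array[Double], Int) = {
    var w = Array(0.5, 0.5)
    var mu = Array(xs.min, xs.max) // crude initialization, an assumption of this sketch
    var s2 = Array(1.0, 1.0)
    var prev = Double.NegativeInfinity
    var ll = logLikelihood(xs, w, mu, s2)
    var iter = 0
    while (iter < maxIter && math.abs(ll - prev) > tol) {
      val (nw, nm, ns) = emStep(xs, w, mu, s2)
      w = nw; mu = nm; s2 = ns
      prev = ll
      ll = logLikelihood(xs, w, mu, s2)
      iter += 1
    }
    (w, mu, s2, iter)
  }
}
```

Because the result depends on the starting point, a different initialization can settle on a different local optimum, which is the caveat stated above.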

Note: This algorithm may perform poorly on high-dimensional data (data with many features), because (a) such data is difficult to cluster at all (based on statistical/theoretical arguments) and (b) Gaussian distributions in high dimensions are prone to numerical issues.

Annotations: @Since( "1.3.0" )
Source: GaussianMixture.scala

Linear Supertypes

Serializable, Serializable, AnyRef, Any

Instance Constructors

new GaussianMixture()

Constructs a default instance. The default parameters are {k: 2, convergenceTol: 0.01, maxIterations: 100, seed: random}.

Annotations
@Since( "1.3.0" )
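As a quick check of the defaults listed above, the following sketch constructs an instance and reads the values back through the getters. It assumes spark-mllib is on the classpath; the object name GaussianMixtureDefaults is hypothetical. No SparkContext is needed just to inspect parameters.

```scala
import org.apache.spark.mllib.clustering.GaussianMixture

// Hypothetical wrapper object, used only for this illustration.
object GaussianMixtureDefaults {
  def main(args: Array[String]): Unit = {
    val gm = new GaussianMixture()

    println(s"k              = ${gm.getK}")
    println(s"convergenceTol = ${gm.getConvergenceTol}")
    println(s"maxIterations  = ${gm.getMaxIterations}")
    // The seed is chosen randomly, so it differs from run to run.
    println(s"seed           = ${gm.getSeed}")
    // No initial model is set unless setInitialModel has been called.
    println(s"initialModel   = ${gm.getInitialModel}")
  }
}
```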

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def getConvergenceTol: Double

Return the largest change in log-likelihood at which convergence is considered to have occurred.

Annotations
@Since( "1.3.0" )
def getInitialModel: Option[GaussianMixtureModel]

Return the user-supplied initial GMM, if one was set.

Annotations
@Since( "1.3.0" )
def getK: Int

Return the number of Gaussians in the mixture model

Annotations
@Since( "1.3.0" )
def getMaxIterations: Int

Return the maximum number of iterations allowed

Annotations
@Since( "1.3.0" )
def getSeed: Long

Return the random seed

Annotations
@Since( "1.3.0" )
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def run(data: JavaRDD[Vector]): GaussianMixtureModel

Java-friendly version of run()

Annotations
@Since( "1.3.0" )
def run(data: RDD[Vector]): GaussianMixtureModel

Perform expectation maximization on the given data and return the fitted GaussianMixtureModel.

Annotations
@Since( "1.3.0" )
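A minimal end-to-end sketch of run(data: RDD[Vector]) follows. It assumes a local Spark installation with spark-mllib on the classpath. The object name GaussianMixtureRunExample, the synthetic data, and the particular parameter values are all illustrative assumptions, not recommendations.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object, used only for this illustration.
object GaussianMixtureRunExample {
  def fit(sc: SparkContext): GaussianMixtureModel = {
    // Synthetic 2-D data: two noisy clusters centered near (0, 0) and (5, 5).
    val rng = new scala.util.Random(7L)
    val points = Seq.tabulate(200) { i =>
      val center = if (i % 2 == 0) 0.0 else 5.0
      Vectors.dense(center + rng.nextGaussian(), center + rng.nextGaussian())
    }
    val data: RDD[Vector] = sc.parallelize(points).cache()

    // Each setter returns this.type, so the calls can be chained.
    new GaussianMixture()
      .setK(2)
      .setConvergenceTol(0.001)
      .setMaxIterations(50)
      .setSeed(42L)
      .run(data)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("gmm-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val model = fit(sc)
      // Print the mixing weight and parameters of each fitted component.
      for (i <- 0 until model.k) {
        println(s"weight=${model.weights(i)} mu=${model.gaussians(i).mu}")
        println(model.gaussians(i).sigma)
      }
    } finally {
      sc.stop()
    }
  }
}
```

Fixing the seed makes the random initialization repeatable, which helps when comparing runs with different values of k or convergenceTol.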
def setConvergenceTol(convergenceTol: Double): GaussianMixture.this.type

Set the largest change in log-likelihood at which convergence is considered to have occurred.

Annotations
@Since( "1.3.0" )
def setInitialModel(model: GaussianMixtureModel): GaussianMixture.this.type

Set the initial GMM starting point, bypassing the random initialization. You must call setK() before calling this method, and the condition (model.k == this.k) must hold; otherwise an IllegalArgumentException is thrown.

Annotations
@Since( "1.3.0" )
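The ordering requirement above can be sketched as follows, assuming spark-mllib is on the classpath. The initial model is built by hand from two MultivariateGaussian components; the object name InitialModelExample, its helper methods, and the chosen means and covariances are hypothetical values for illustration.

```scala
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel}
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.stat.distribution.MultivariateGaussian

// Hypothetical wrapper object, used only for this illustration.
object InitialModelExample {
  // A hand-built two-component starting point in two dimensions.
  def initialModel(): GaussianMixtureModel = {
    // 2x2 identity covariance, given in column-major order.
    val identity = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
    val gaussians = Array(
      new MultivariateGaussian(Vectors.dense(0.0, 0.0), identity),
      new MultivariateGaussian(Vectors.dense(5.0, 5.0), identity))
    new GaussianMixtureModel(Array(0.5, 0.5), gaussians)
  }

  // setK first, so that model.k == this.k holds when the model is supplied.
  def configured(): GaussianMixture =
    new GaussianMixture().setK(2).setInitialModel(initialModel())

  // A mismatched k is rejected with an IllegalArgumentException.
  def mismatchRejected(): Boolean =
    try {
      new GaussianMixture().setK(3).setInitialModel(initialModel())
      false
    } catch {
      case _: IllegalArgumentException => true
    }
}
```

Supplying a starting point this way can be useful when a rough estimate of the components is already available, for example from a previous fit.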
def setK(k: Int): GaussianMixture.this.type

Set the number of Gaussians in the mixture model. Default: 2

Annotations
@Since( "1.3.0" )
def setMaxIterations(maxIterations: Int): GaussianMixture.this.type

Set the maximum number of iterations allowed. Default: 100

Annotations
@Since( "1.3.0" )
def setSeed(seed: Long): GaussianMixture.this.type

Set the random seed

Annotations
@Since( "1.3.0" )
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package clustering

class GaussianMixture extends Serializable
