class KMeans extends Serializable with Logging
K-means clustering with a k-means++-like initialization mode (the k-means|| algorithm by Bahmani et al.).
This is an iterative algorithm that will make multiple passes over the data, so any RDDs given to it should be cached by the user.
- Annotations
- @Since("0.8.0")
- Source
- KMeans.scala
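A minimal usage sketch of the advice above: cache the input RDD, configure the estimator through its setters, then call `run`. The object name `KMeansUsageSketch`, the sample points, the `local[2]` master and the seed are illustrative assumptions, not part of this page.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

object KMeansUsageSketch {
  /** Trains a model on a small, hypothetical in-memory dataset. */
  def train(sc: SparkContext): KMeansModel = {
    // Two well-separated groups of 2-D points (illustrative data only).
    val points: RDD[Vector] = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.2), Vectors.dense(0.2, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.2), Vectors.dense(9.2, 9.1)
    )).cache() // cached because run() makes multiple passes over the data

    new KMeans()
      .setK(2)
      .setMaxIterations(20)
      .setSeed(42L) // fixed seed so repeated runs start from the same state
      .run(points)
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run on a single machine.
    val conf = new SparkConf().setAppName("KMeansUsageSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val model = train(sc)
      model.clusterCenters.foreach(println)             // learned centers
      println(model.predict(Vectors.dense(0.05, 0.05))) // cluster index of a new point
    } finally {
      sc.stop()
    }
  }
}
```

The returned `KMeansModel` exposes the learned centers through `clusterCenters` and assigns new points with `predict`.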
Instance Constructors
-    new KMeans()
Constructs a KMeans instance with default parameters: {k: 2, maxIterations: 20, initializationMode: "k-means||", initializationSteps: 2, epsilon: 1e-4, seed: random, distanceMeasure: "euclidean"}.
- Annotations
- @Since("0.8.0")
 
Type Members
-   implicit  class LogStringContext extends AnyRef
- Definition Classes
- Logging
 
Value Members
-   final  def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
 
-   final  def ##: Int
- Definition Classes
- AnyRef → Any
 
-   final  def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
 
-    def MDC(key: LogKey, value: Any): MDC
- Attributes
- protected
- Definition Classes
- Logging
 
-   final  def asInstanceOf[T0]: T0
- Definition Classes
- Any
 
-    def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
 
-   final  def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
 
-    def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
 
-   final  def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
 
-    def getDistanceMeasure: String
The distance measure used by the algorithm.
- Annotations
- @Since("2.4.0")
 
-    def getEpsilon: Double
The distance threshold within which we consider centers to have converged.
- Annotations
- @Since("1.4.0")
 
-    def getInitializationMode: String
The initialization algorithm. This can be either "random" or "k-means||".
- Annotations
- @Since("1.4.0")
 
-    def getInitializationSteps: Int
Number of steps for the k-means|| initialization mode.
- Annotations
- @Since("1.4.0")
 
-    def getK: Int
Number of clusters to create (k).
- Annotations
- @Since("1.4.0")
- Note
- It is possible for fewer than k clusters to be returned, for example, if there are fewer than k distinct points to cluster. 
 
-    def getMaxIterations: Int
Maximum number of iterations allowed.
- Annotations
- @Since("1.4.0")
 
-    def getSeed: Long
The random seed for cluster initialization.
- Annotations
- @Since("1.4.0")
 
-    def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
 
-    def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
 
-    def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-   final  def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
 
-    def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
 
-    def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logBasedOnLevel(level: Level)(f: => MessageWithContext): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logDebug(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logDebug(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logError(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logError(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logInfo(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logInfo(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logName: String
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logTrace(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logTrace(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logWarning(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logWarning(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-    def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
-   final  def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
 
-   final  def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
 
-   final  def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
 
-    def run(data: RDD[Vector]): KMeansModel
Train a K-means model on the given set of points; data should be cached for high performance, because this is an iterative algorithm.
- Annotations
- @Since("0.8.0")
 
-    def setDistanceMeasure(distanceMeasure: String): KMeans.this.type
Set the distance measure used by the algorithm.
- Annotations
- @Since("2.4.0")
 
-    def setEpsilon(epsilon: Double): KMeans.this.type
Set the distance threshold within which we consider centers to have converged. If every center moves less than this Euclidean distance in an iteration, the run stops iterating.
- Annotations
- @Since("0.8.0")
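The stopping rule described for `setEpsilon` can be sketched in plain Scala. This is an illustration of the criterion as worded here, not the library's internal code; the object and method names (`ConvergenceSketch`, `euclidean`, `hasConverged`) are hypothetical, and centers are modeled as plain `Array[Double]` values.

```scala
object ConvergenceSketch {
  /** Euclidean distance between two points of equal dimension. */
  def euclidean(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "points must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  /** True when every center moved less than `epsilon` between two iterations. */
  def hasConverged(
      previous: Seq[Array[Double]],
      current: Seq[Array[Double]],
      epsilon: Double): Boolean = {
    require(previous.length == current.length, "center counts must match")
    previous.zip(current).forall { case (p, c) => euclidean(p, c) < epsilon }
  }
}
```

A smaller `epsilon` demands more stable centers before stopping, so a run may use more of its `maxIterations` budget.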
 
-    def setInitialModel(model: KMeansModel): KMeans.this.type
Set the initial starting point, bypassing the random initialization or k-means||. The condition model.k == this.k must be met; otherwise an IllegalArgumentException is thrown.
- Annotations
- @Since("1.4.0")
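One way to satisfy the `model.k == this.k` condition is to derive both values from the same array of centers. The sketch below assumes the public `KMeansModel(clusterCenters: Array[Vector])` constructor; the object name `InitialModelSketch` and the helper `withInitialCenters` are hypothetical.

```scala
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

object InitialModelSketch {
  /** Builds a KMeans instance that starts from the given (hypothetical) centers. */
  def withInitialCenters(centers: Array[Vector]): KMeans = {
    // The model's k is the number of centers it holds.
    val initial = new KMeansModel(centers)
    new KMeans()
      .setK(centers.length)     // set k first so that model.k == this.k holds
      .setInitialModel(initial) // skips "random" and "k-means||" initialization
  }

  def main(args: Array[String]): Unit = {
    val kmeans = withInitialCenters(Array(
      Vectors.dense(0.0, 0.0),
      Vectors.dense(9.0, 9.0)
    ))
    println(kmeans.getK)
  }
}
```

Calling `setK` before `setInitialModel` keeps the condition satisfied regardless of when the check is performed.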
 
-    def setInitializationMode(initializationMode: String): KMeans.this.type
Set the initialization algorithm. This can be either "random" to choose random points as initial cluster centers, or "k-means||" to use a parallel variant of k-means++ (Bahmani et al., Scalable K-Means++, VLDB 2012). Default: k-means||.
- Annotations
- @Since("0.8.0")
 
-    def setInitializationSteps(initializationSteps: Int): KMeans.this.type
Set the number of steps for the k-means|| initialization mode. This is an advanced setting; the default of 2 is almost always enough. Default: 2.
- Annotations
- @Since("0.8.0")
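The two initialization modes can be selected as sketched below. Only the setters and string values listed on this page are used; the object name `InitializationConfigSketch` and its two helper methods are hypothetical.

```scala
import org.apache.spark.mllib.clustering.KMeans

object InitializationConfigSketch {
  /** k-means|| initialization with an explicit step count and a fixed seed. */
  def parallelInit(k: Int, seed: Long): KMeans =
    new KMeans()
      .setK(k)
      .setInitializationMode("k-means||") // the documented default, set explicitly
      .setInitializationSteps(2)          // the documented default step count
      .setSeed(seed)

  /** Random initialization: random points become the initial cluster centers. */
  def randomInit(k: Int, seed: Long): KMeans =
    new KMeans()
      .setK(k)
      .setInitializationMode("random")
      .setSeed(seed)

  def main(args: Array[String]): Unit = {
    val kmeans = randomInit(k = 3, seed = 7L)
    // Getters read back the configured values.
    println(s"${kmeans.getInitializationMode}, k=${kmeans.getK}, seed=${kmeans.getSeed}")
  }
}
```

Fixing the seed makes the initialization repeatable, which helps when comparing the two modes on the same data.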
 
-    def setK(k: Int): KMeans.this.type
Set the number of clusters to create (k).
- Annotations
- @Since("0.8.0")
- Note
- It is possible for fewer than k clusters to be returned, for example, if there are fewer than k distinct points to cluster. Default: 2. 
 
-    def setMaxIterations(maxIterations: Int): KMeans.this.type
Set the maximum number of iterations allowed. Default: 20.
- Annotations
- @Since("0.8.0")
 
-    def setSeed(seed: Long): KMeans.this.type
Set the random seed for cluster initialization.
- Annotations
- @Since("1.4.0")
 
-   final  def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
 
-    def toString(): String
- Definition Classes
- AnyRef → Any
 
-   final  def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
 
-   final  def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
 
-   final  def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
 
-    def withLogContext(context: Map[String, String])(body: => Unit): Unit
- Attributes
- protected
- Definition Classes
- Logging
 
Deprecated Value Members
-    def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
- (Since version 9)