class StreamingKMeans extends Logging with Serializable
StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming, and using the model to make predictions on streaming data. See KMeansModel for details on algorithm and update rules.
Use a builder pattern to construct a streaming k-means analysis in an application, like:
val model = new StreamingKMeans() .setDecayFactor(0.5) .setK(3) .setRandomCenters(5, 100.0) .trainOn(DStream)
- Annotations
- @Since( "1.2.0" )
- Source
- StreamingKMeans.scala
- Alphabetic
- By Inheritance
- StreamingKMeans
- Serializable
- Serializable
- Logging
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Instance Constructors
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
var
decayFactor: Double
- Annotations
- @Since( "1.2.0" )
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
var
k: Int
- Annotations
- @Since( "1.2.0" )
-
def
latestModel(): StreamingKMeansModel
Return the latest model.
Return the latest model.
- Annotations
- @Since( "1.2.0" )
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
var
model: StreamingKMeansModel
- Attributes
- protected
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
predictOn(data: JavaDStream[Vector]): JavaDStream[Integer]
Java-friendly version of
predictOn
.Java-friendly version of
predictOn
.- Annotations
- @Since( "1.4.0" )
-
def
predictOn(data: DStream[Vector]): DStream[Int]
Use the clustering model to make predictions on batches of data from a DStream.
Use the clustering model to make predictions on batches of data from a DStream.
- data
DStream containing vector data
- returns
DStream containing predictions
- Annotations
- @Since( "1.2.0" )
-
def
predictOnValues[K](data: JavaPairDStream[K, Vector]): JavaPairDStream[K, Integer]
Java-friendly version of
predictOnValues
.Java-friendly version of
predictOnValues
.- Annotations
- @Since( "1.4.0" )
-
def
predictOnValues[K](data: DStream[(K, Vector)])(implicit arg0: ClassTag[K]): DStream[(K, Int)]
Use the model to make predictions on the values of a DStream and carry over its keys.
Use the model to make predictions on the values of a DStream and carry over its keys.
- K
key type
- data
DStream containing (key, feature vector) pairs
- returns
DStream containing the input keys and the predictions as values
- Annotations
- @Since( "1.2.0" )
-
def
setDecayFactor(a: Double): StreamingKMeans.this.type
Set the forgetfulness of the previous centroids.
Set the forgetfulness of the previous centroids.
- Annotations
- @Since( "1.2.0" )
-
def
setHalfLife(halfLife: Double, timeUnit: String): StreamingKMeans.this.type
Set the half life and time unit ("batches" or "points").
Set the half life and time unit ("batches" or "points"). If points, then the decay factor is raised to the power of number of new points and if batches, then decay factor will be used as is.
- Annotations
- @Since( "1.2.0" )
-
def
setInitialCenters(centers: Array[Vector], weights: Array[Double]): StreamingKMeans.this.type
Specify initial centers directly.
Specify initial centers directly.
- Annotations
- @Since( "1.2.0" )
-
def
setK(k: Int): StreamingKMeans.this.type
Set the number of clusters.
Set the number of clusters.
- Annotations
- @Since( "1.2.0" )
-
def
setRandomCenters(dim: Int, weight: Double, seed: Long = Utils.random.nextLong): StreamingKMeans.this.type
Initialize random centers, requiring only the number of dimensions.
Initialize random centers, requiring only the number of dimensions.
- dim
Number of dimensions
- weight
Weight for each center
- seed
Random seed
- Annotations
- @Since( "1.2.0" )
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
var
timeUnit: String
- Annotations
- @Since( "1.2.0" )
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
trainOn(data: JavaDStream[Vector]): Unit
Java-friendly version of
trainOn
.Java-friendly version of
trainOn
.- Annotations
- @Since( "1.4.0" )
-
def
trainOn(data: DStream[Vector]): Unit
Update the clustering model by training on batches of data from a DStream.
Update the clustering model by training on batches of data from a DStream. This operation registers a DStream for training the model, checks whether the cluster centers have been initialized, and updates the model using each batch of data from the stream.
- data
DStream containing vector data
- Annotations
- @Since( "1.2.0" )
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()