org.apache.spark.mllib.clustering
Class PowerIterationClustering

Object
  extended by org.apache.spark.mllib.clustering.PowerIterationClustering
All Implemented Interfaces:
java.io.Serializable

public class PowerIterationClustering
extends Object
implements scala.Serializable

See Also:
Serialized Form

Nested Class Summary
static class PowerIterationClustering.Assignment
          :: Experimental :: Cluster assignment.
static class PowerIterationClustering.Assignment$
           
 
Constructor Summary
PowerIterationClustering()
           
 
Method Summary
 PowerIterationClusteringModel run(JavaRDD<scala.Tuple3<Long,Long,Double>> similarities)
          A Java-friendly version of PowerIterationClustering.run.
 PowerIterationClusteringModel run(RDD<scala.Tuple3<Object,Object,Object>> similarities)
          Run the PIC algorithm.
 PowerIterationClustering setInitializationMode(String mode)
          Set the initialization mode.
 PowerIterationClustering setK(int k)
          Set the number of clusters.
 PowerIterationClustering setMaxIterations(int maxIterations)
          Set the maximum number of iterations of the power iteration loop.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PowerIterationClustering

public PowerIterationClustering()
Method Detail

setK

public PowerIterationClustering setK(int k)
Set the number of clusters.

setMaxIterations

public PowerIterationClustering setMaxIterations(int maxIterations)
Set the maximum number of iterations of the power iteration loop.

setInitializationMode

public PowerIterationClustering setInitializationMode(String mode)
Set the initialization mode. This can be either "random", which uses a random vector as the vertex properties, or "degree", which uses the normalized sum of similarities. Default: "random".

Parameters:
mode - the initialization mode, either "random" or "degree"
Returns:
this PowerIterationClustering instance, so that setter calls can be chained
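
The "degree" mode can be pictured with a small, Spark-free sketch. It assumes that "normalized sum of similarities" means each vertex's row sum divided by the total of all row sums; the class and method names (DegreeInitSketch, degreeInit) are hypothetical and are not part of the MLlib API, and the real implementation works on a distributed graph rather than a dense array.

```java
import java.util.Arrays;

// Illustrative only: a dense, single-machine reading of "degree" initialization.
// DegreeInitSketch and degreeInit are hypothetical names, not Spark APIs.
final class DegreeInitSketch {

    /**
     * Returns v0 where v0[i] = (sum of row i) / (sum of all rows).
     * Assumes a square, symmetric, nonnegative affinity matrix.
     * Diagonal entries are skipped, mirroring the rule that i = j is ignored.
     */
    static double[] degreeInit(double[][] affinity) {
        int n = affinity.length;
        double[] v0 = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double rowSum = 0.0;
            for (int j = 0; j < n; j++) {
                if (i != j) {
                    rowSum += affinity[i][j];
                }
            }
            v0[i] = rowSum;
            total += rowSum;
        }
        if (total == 0.0) {
            throw new IllegalArgumentException("affinity matrix has no nonzero similarities");
        }
        // Normalize so that the entries of v0 sum to 1.
        for (int i = 0; i < n; i++) {
            v0[i] /= total;
        }
        return v0;
    }

    public static void main(String[] args) {
        double[][] a = {
            {0.0, 1.0, 0.0},
            {1.0, 0.0, 3.0},
            {0.0, 3.0, 0.0}
        };
        // Row sums are 1, 4 and 3 (total 8), so this prints [0.125, 0.5, 0.375]
        System.out.println(Arrays.toString(degreeInit(a)));
    }
}
```

Under this reading, vertices with stronger overall similarity start with larger values, whereas "random" starts from a vector that does not depend on the input.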

run

public PowerIterationClusteringModel run(RDD<scala.Tuple3<Object,Object,Object>> similarities)
Run the PIC algorithm.

Parameters:
similarities - an RDD of (i, j, s_ij) tuples representing the affinity matrix, which is the matrix A in the PIC paper. Each similarity s_ij must be nonnegative. The matrix is symmetric, so s_ij = s_ji. For any pair (i, j) with nonzero similarity, the input should contain either (i, j, s_ij) or (j, i, s_ji). Tuples with i = j are ignored, because the diagonal is assumed to be zero (s_ii = 0.0).

Returns:
a PowerIterationClusteringModel that contains the clustering result
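
The input requirements above can be made concrete with a Spark-free sketch that turns a dense symmetric matrix into the list of entries the algorithm expects: one entry per unordered pair, no diagonal, no zeros, no negatives. The names SimilarityTuplesSketch, Entry and toEntries are hypothetical; in a real job each entry would be a scala.Tuple3 inside an RDD or JavaRDD.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: builds (i, j, s_ij) entries from a dense affinity matrix.
// SimilarityTuplesSketch, Entry and toEntries are hypothetical names, not Spark APIs.
final class SimilarityTuplesSketch {

    /** Stand-in for scala.Tuple3<Long, Long, Double>. */
    static final class Entry {
        final long i;
        final long j;
        final double s;

        Entry(long i, long j, double s) {
            this.i = i;
            this.j = j;
            this.s = s;
        }

        @Override
        public String toString() {
            return "(" + i + ", " + j + ", " + s + ")";
        }
    }

    /**
     * Emits one entry per pair i < j with a positive similarity.
     * Rejects negative or asymmetric values; the diagonal is never visited.
     */
    static List<Entry> toEntries(double[][] affinity) {
        List<Entry> out = new ArrayList<>();
        int n = affinity.length;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double s = affinity[i][j];
                if (s < 0.0) {
                    throw new IllegalArgumentException("similarity must be nonnegative");
                }
                if (s != affinity[j][i]) {
                    throw new IllegalArgumentException("affinity matrix must be symmetric");
                }
                if (s > 0.0) {
                    // Only (i, j, s_ij) is emitted; (j, i, s_ji) would be redundant.
                    out.add(new Entry(i, j, s));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {
            {0.0, 1.0, 0.0},
            {1.0, 0.0, 3.0},
            {0.0, 3.0, 0.0}
        };
        // Prints [(0, 1, 1.0), (1, 2, 3.0)]
        System.out.println(toEntries(a));
    }
}
```

Emitting only the upper triangle keeps the input half the size of the full matrix while still satisfying the "either (i, j, s_ij) or (j, i, s_ji)" rule.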

run

public PowerIterationClusteringModel run(JavaRDD<scala.Tuple3<Long,Long,Double>> similarities)
A Java-friendly version of PowerIterationClustering.run.

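Parameters:
similarities - a JavaRDD of (i, j, s_ij) tuples representing the affinity matrix, subject to the same requirements as the RDD-based run
Returns:
a PowerIterationClusteringModel that contains the clustering result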