org.apache.spark.mllib.clustering

PowerIterationClustering

class PowerIterationClustering extends Serializable

:: Experimental ::

Power Iteration Clustering (PIC) is a scalable graph clustering algorithm developed by Lin and Cohen. From the abstract: PIC finds a very low-dimensional embedding of a dataset using truncated power iteration on a normalized pair-wise similarity matrix of the data.
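
The idea in the abstract can be illustrated without Spark. The following is a minimal sketch in plain Scala, not the MLlib implementation: it row-normalizes a small dense affinity matrix and applies a fixed number of power-iteration steps. The object name `PowerIterationSketch`, the helper names, the example matrix, and the fixed iteration count (standing in for a real stopping rule) are all assumptions made for illustration.

```scala
object PowerIterationSketch {
  // Hypothetical example data: two triangles {0,1,2} and {3,4,5} with
  // similarity 1.0 inside each triangle and one weak 0.01 link between 2 and 3.
  val twoTriangles: Array[Array[Double]] = {
    val a = Array.fill(6, 6)(0.0)
    val edges = Seq(
      (0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
      (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
      (2, 3, 0.01))
    for ((i, j, s) <- edges) { a(i)(j) = s; a(j)(i) = s } // keep the matrix symmetric
    a
  }

  // Divide every row by its sum, so each row of the result sums to 1.
  // Rows that sum to zero are copied unchanged.
  def rowNormalize(a: Array[Array[Double]]): Array[Array[Double]] =
    a.map { row =>
      val sum = row.sum
      if (sum == 0.0) row.clone() else row.map(_ / sum)
    }

  // Apply `iterations` steps of v <- W v / ||W v||_1, where W is the
  // row-normalized matrix. Stopping after a few steps is the "truncated" part.
  // Assumes nonnegative similarities and a start vector that is not all zeros.
  def embed(a: Array[Array[Double]], v0: Array[Double], iterations: Int): Array[Double] = {
    require(a.length == v0.length, "v0 needs one entry per row of a")
    val w = rowNormalize(a)
    var v = v0
    for (_ <- 0 until iterations) {
      val wv = w.map(row => row.indices.map(j => row(j) * v(j)).sum)
      val norm = wv.map(x => math.abs(x)).sum
      v = wv.map(_ / norm)
    }
    v
  }
}
```

Calling `PowerIterationSketch.embed(PowerIterationSketch.twoTriangles, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), 10)` yields a one-dimensional embedding in which the entries for vertices 0 to 2 are close to one another and clearly separated from the entries for vertices 3 to 5. Running many more iterations would eventually flatten the vector toward a constant, which is why the iteration is stopped early.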

Annotations
@Experimental() @Since( "1.3.0" )
See also

Spectral clustering (Wikipedia)

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new PowerIterationClustering()

    Constructs a PIC instance with default parameters: {k: 2, maxIterations: 100, initMode: "random"}.

    Annotations
    @Since( "1.3.0" )
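
As a small sketch of what the default constructor amounts to, the two instances below should be configured identically, assuming the defaults listed above ({k: 2, maxIterations: 100, initMode: "random"}). The wrapper object `PicDefaults` and the value names are hypothetical; the setters are the ones documented under Value Members.

```scala
import org.apache.spark.mllib.clustering.PowerIterationClustering

object PicDefaults {
  // Relies on the documented defaults: k = 2, maxIterations = 100, initMode = "random".
  val withDefaults: PowerIterationClustering = new PowerIterationClustering()

  // Spells out the same values explicitly. Each setter returns this.type,
  // so the calls can be chained on the freshly constructed instance.
  val explicit: PowerIterationClustering = new PowerIterationClustering()
    .setK(2)
    .setMaxIterations(100)
    .setInitializationMode("random")
}
```

Writing the parameters out explicitly can make a job easier to read even when the values match the defaults.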

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. def run(similarities: JavaRDD[(Long, Long, Double)]): PowerIterationClusteringModel

    A Java-friendly version of PowerIterationClustering.run.

    Annotations
    @Since( "1.3.0" )
  18. def run(similarities: RDD[(Long, Long, Double)]): PowerIterationClusteringModel

    Run the PIC algorithm.

    similarities

    an RDD of (i, j, s_ij) tuples representing the affinity matrix, which is the matrix A in the PIC paper. The similarity s_ij must be nonnegative. This is a symmetric matrix and hence s_ij = s_ji. For any (i, j) with nonzero similarity, there should be either (i, j, s_ij) or (j, i, s_ji) in the input. Tuples with i = j are ignored, because we assume s_ij = 0.0.

    returns

    a PowerIterationClusteringModel that contains the clustering result

    Annotations
    @Since( "1.3.0" )
  19. def run(graph: Graph[Double, Double]): PowerIterationClusteringModel

    Run the PIC algorithm on Graph.

    graph

    an affinity matrix represented as a graph, which is the matrix A in the PIC paper. The similarity s_ij represented as the edge between vertices (i, j) must be nonnegative. This is a symmetric matrix and hence s_ij = s_ji. For any (i, j) with nonzero similarity, there should be either (i, j, s_ij) or (j, i, s_ji) in the input. Tuples with i = j are ignored, because we assume s_ij = 0.0.

    returns

    a PowerIterationClusteringModel that contains the clustering result

    Annotations
    @Since( "1.5.0" )
  20. def setInitializationMode(mode: String): PowerIterationClustering.this.type

    Set the initialization mode. This can be either "random" to use a random vector as vertex properties, or "degree" to use the normalized sum of similarities. Default: random.

    Annotations
    @Since( "1.3.0" )
  21. def setK(k: Int): PowerIterationClustering.this.type

    Set the number of clusters.

    Annotations
    @Since( "1.3.0" )
  22. def setMaxIterations(maxIterations: Int): PowerIterationClustering.this.type

    Set the maximum number of iterations of the power iteration loop.

    Annotations
    @Since( "1.3.0" )
  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. def toString(): String

    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
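
The members above can be combined as in the following sketch, which configures an instance with the setters and calls `run` on an RDD of (i, j, s_ij) tuples. It assumes Spark MLlib is on the classpath and a Spark version in which `PowerIterationClusteringModel` exposes `k` and `assignments`. The object name `PicExample`, the method name `cluster`, and the similarity values are hypothetical.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.{PowerIterationClustering, PowerIterationClusteringModel}
import org.apache.spark.rdd.RDD

object PicExample {
  // Clusters a tiny hand-written affinity matrix into two groups.
  def cluster(sc: SparkContext): PowerIterationClusteringModel = {
    // Two triangles joined by one weak edge. Each pair appears once,
    // as either (i, j, s_ij) or (j, i, s_ji), and every similarity is nonnegative.
    val similarities: RDD[(Long, Long, Double)] = sc.parallelize(Seq(
      (0L, 1L, 1.0), (0L, 2L, 1.0), (1L, 2L, 1.0),
      (3L, 4L, 1.0), (3L, 5L, 1.0), (4L, 5L, 1.0),
      (2L, 3L, 0.01)))

    new PowerIterationClustering()
      .setK(2)                         // number of clusters
      .setMaxIterations(20)            // cap on the power iteration loop
      .setInitializationMode("random") // "degree" is the other documented mode
      .run(similarities)
  }
}
```

A caller might create a local context with `new SparkContext(new SparkConf().setMaster("local[2]").setAppName("pic-sketch"))`, pass it to `PicExample.cluster`, and inspect `model.assignments.collect()`. Because the "random" mode starts from a random vector, the numeric cluster labels can differ between runs.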
