Class TriangleCount

Object
org.apache.spark.graphx.lib.TriangleCount

public class TriangleCount extends Object
Compute the number of triangles passing through each vertex.

The algorithm is relatively straightforward and can be computed in three steps:

  • Compute the set of neighbors for each vertex.
  • For each edge, compute the intersection of the neighbor sets of its two endpoints and send the count to both vertices.
  • Compute the sum at each vertex and divide by two, since each triangle is counted twice.
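
The three steps can be sketched in plain Scala on an in-memory edge list. This is only an illustration of the idea, not the GraphX implementation: the object name `TriangleSketch` and the method `triangleCounts` are hypothetical, and the sketch canonicalizes its input first so that the divide-by-two step is valid.

```scala
// Plain-Scala sketch of the three steps; no Spark involved.
// TriangleSketch and triangleCounts are illustrative names, not part of GraphX.
object TriangleSketch {
  type VertexId = Long

  def triangleCounts(edges: Seq[(VertexId, VertexId)]): Map[VertexId, Int] = {
    // Canonicalize: drop self edges, orient every edge, remove duplicates.
    val canonical = edges
      .filter { case (s, d) => s != d }
      .map { case (s, d) => if (s < d) (s, d) else (d, s) }
      .distinct

    // Step 1: neighbor set of every vertex (both directions of each edge).
    val neighbors: Map[VertexId, Set[VertexId]] =
      canonical
        .flatMap { case (s, d) => Seq(s -> d, d -> s) }
        .groupBy(_._1)
        .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

    // Step 2: for each edge, count the common neighbors of its endpoints
    // and send that count to both endpoints.
    val messages: Seq[(VertexId, Int)] = canonical.flatMap { case (s, d) =>
      val common = (neighbors(s) intersect neighbors(d)).size
      Seq(s -> common, d -> common)
    }

    // Step 3: sum per vertex and halve, because a triangle at a vertex is
    // reported once by each of the vertex's two edges in that triangle.
    messages.groupBy(_._1).map { case (v, ms) => v -> ms.map(_._2).sum / 2 }
  }
}
```

For the edge list (1,2), (2,3), (3,1), (3,4), the sketch returns a count of 1 for vertices 1, 2 and 3 and 0 for vertex 4. Vertices that have no edges left after canonicalization do not appear in the result.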

There are two implementations. The default TriangleCount.run implementation first removes self edges and canonicalizes the graph so that the following conditions hold:

  • There are no self edges
  • All edges are oriented (src is less than dst, the orientation that GraphOps.convertToCanonicalEdges produces)
  • There are no duplicate edges

However, canonicalization is costly because it requires repartitioning the graph. If the input data is already in "canonical form" with self edges removed, use TriangleCount.runPreCanonicalized instead:


 val canonicalGraph = graph.mapEdges(e => 1).removeSelfEdges().convertToCanonicalEdges()
 val counts = TriangleCount.runPreCanonicalized(canonicalGraph).vertices
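
For comparison, a minimal sketch of calling the default TriangleCount.run on a graph that is not in canonical form. It assumes Spark with GraphX on the classpath and a local master; the object name `TriangleCountExample` and the toy edge list are made up for illustration.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.graphx.lib.TriangleCount
import org.apache.spark.sql.SparkSession

// Hypothetical driver; assumes a local Spark installation with GraphX available.
object TriangleCountExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("TriangleCountExample")
      .getOrCreate()
    val sc = spark.sparkContext

    // Toy input that is deliberately NOT canonical:
    // it contains a reversed duplicate (2 -> 1) and a self edge (4 -> 4).
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 1L, 1),
      Edge(3L, 4L, 1), Edge(2L, 1L, 1), Edge(4L, 4L, 1)
    ))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // run canonicalizes internally, so no preprocessing is needed here.
    val counts = TriangleCount.run(graph).vertices

    counts.collect().sortBy(_._1).foreach { case (id, n) =>
      println(s"vertex $id: $n triangle(s)")
    }

    spark.stop()
  }
}
```

The trade-off is the one described above: run is safe on arbitrary input but pays for canonicalization on every call, while runPreCanonicalized skips that work and relies on the caller to supply a canonical graph.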
 

  • Constructor Details

    • TriangleCount

      public TriangleCount()
  • Method Details

    • run

      public static <VD, ED> Graph<Object,ED> run(Graph<VD,ED> graph, scala.reflect.ClassTag<VD> evidence$1, scala.reflect.ClassTag<ED> evidence$2)
    • runPreCanonicalized

      public static <VD, ED> Graph<Object,ED> runPreCanonicalized(Graph<VD,ED> graph, scala.reflect.ClassTag<VD> evidence$3, scala.reflect.ClassTag<ED> evidence$4)