Connected components algorithm.
Label Propagation algorithm.
PageRank algorithm implementation. Two implementations of PageRank are provided.
The first implementation uses the standalone Graph interface and runs PageRank for a fixed number of iterations:
  var PR = Array.fill(n)( 1.0 )
  val oldPR = Array.fill(n)( 1.0 )
  for( iter <- 0 until numIter ) {
    swap(oldPR, PR)
    for( i <- 0 until n ) {
      PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
    }
  }
The second implementation uses the Pregel interface and runs PageRank until convergence:
  var PR = Array.fill(n)( 1.0 )
  val oldPR = Array.fill(n)( 0.0 )
  while( max(abs(PR - oldPR)) > tol ) {
    swap(oldPR, PR)
    for( i <- 0 until n if abs(PR[i] - oldPR[i]) > tol ) {
      PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
    }
  }
alpha is the random reset probability (typically 0.15), inNbrs[i] is the set of neighbors which link to i, and outDeg[j] is the out-degree of vertex j.
Note that this is not the "normalized" PageRank and as a consequence pages that have no inlinks will have a PageRank of alpha.
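The two update loops above can be sketched as runnable plain Scala. This is a minimal illustration, not the library's implementation: it assumes vertices are numbered 0 until n with n > 0, takes edges as (src, dst) pairs, and the names PageRankSketch, step, fixedIterations and untilConvergence are hypothetical. Unlike the Pregel-style pseudocode, the convergence variant recomputes every vertex on each pass instead of only the vertices that are still changing.

```scala
object PageRankSketch {
  // One synchronous pass: every new rank is computed from the previous ranks.
  private def step(n: Int, edges: Seq[(Int, Int)], old: Array[Double], alpha: Double): Array[Double] = {
    val outDeg = Array.fill(n)(0)
    edges.foreach { case (src, _) => outDeg(src) += 1 }
    // sums(i) accumulates oldPR(j) / outDeg(j) over all in-neighbors j of i.
    val sums = Array.fill(n)(0.0)
    edges.foreach { case (src, dst) => sums(dst) += old(src) / outDeg(src) }
    Array.tabulate(n)(i => alpha + (1 - alpha) * sums(i))
  }

  // Fixed number of iterations, starting from a rank of 1.0 everywhere.
  def fixedIterations(n: Int, edges: Seq[(Int, Int)], numIter: Int, alpha: Double = 0.15): Array[Double] = {
    var pr = Array.fill(n)(1.0)
    for (_ <- 0 until numIter) pr = step(n, edges, pr, alpha)
    pr
  }

  // Iterate until the largest per-vertex change is at most tol.
  def untilConvergence(n: Int, edges: Seq[(Int, Int)], tol: Double, alpha: Double = 0.15): Array[Double] = {
    var pr = Array.fill(n)(1.0)
    var delta = Double.MaxValue
    while (delta > tol) {
      val next = step(n, edges, pr, alpha)
      delta = (0 until n).map(i => math.abs(next(i) - pr(i))).max
      pr = next
    }
    pr
  }
}
```

With a single edge 0 -> 1, vertex 0 has no inlinks, so its rank settles at alpha, which matches the note about unnormalized PageRank.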
Implementation of SVD++ algorithm.
Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
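The shape of that result can be illustrated with a small plain-Scala sketch. It is not the library's code: it assumes every edge counts as one hop and that paths follow edge direction from a vertex toward the landmark, and the names ShortestPathsSketch, run and bfs are hypothetical. Each landmark is handled by a breadth-first search over reversed edges.

```scala
import scala.collection.mutable

object ShortestPathsSketch {
  // Returns, for every vertex, a map from each reachable landmark to its hop distance.
  def run(edges: Seq[(Long, Long)], landmarks: Seq[Long]): Map[Long, Map[Long, Int]] = {
    // Reverse adjacency: for each destination, the vertices with an edge pointing at it.
    val preds: Map[Long, Seq[Long]] =
      edges.groupBy(_._2).map { case (dst, es) => dst -> es.map(_._1) }

    // Breadth-first search backwards from one landmark.
    def bfs(landmark: Long): Map[Long, Int] = {
      val dist = mutable.Map(landmark -> 0)
      val queue = mutable.Queue(landmark)
      while (queue.nonEmpty) {
        val v = queue.dequeue()
        for (p <- preds.getOrElse(v, Seq.empty[Long]) if !dist.contains(p)) {
          dist(p) = dist(v) + 1
          queue.enqueue(p)
        }
      }
      dist.toMap
    }

    val perLandmark = landmarks.map(lm => lm -> bfs(lm))
    val vertices = (edges.flatMap { case (s, d) => Seq(s, d) } ++ landmarks).distinct
    // Landmarks a vertex cannot reach are simply absent from its map.
    vertices.map { v =>
      v -> perLandmark.collect { case (lm, dist) if dist.contains(v) => lm -> dist(v) }.toMap
    }.toMap
  }
}
```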
Strongly connected components algorithm implementation.
Compute the number of triangles passing through each vertex.
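The algorithm is relatively straightforward and can be computed in three steps: compute the set of neighbors for each vertex; for each edge, compute the intersection of the two endpoint neighbor sets and send the count to both vertices; then sum the counts at each vertex and divide by two, since each triangle is counted twice.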
There are two implementations. The default TriangleCount.run implementation first removes self cycles and canonicalizes the graph to ensure that the following conditions hold: there are no self edges, all edges are oriented, and there are no duplicate edges.
However, the canonicalization procedure is costly as it requires repartitioning the graph.
If the input data is already in "canonical form" with self cycles removed, then TriangleCount.runPreCanonicalized should be used instead.
  val canonicalGraph = graph.mapEdges(e => 1).removeSelfEdges().convertToCanonicalEdges()
  val counts = TriangleCount.runPreCanonicalized(canonicalGraph).vertices
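The canonicalization and counting described above can be sketched in plain Scala without a cluster. This is an illustration under stated assumptions rather than the library's implementation: edges are plain (Long, Long) pairs, orientation is chosen as (smaller id, larger id), and the names TriangleCountSketch and run are hypothetical. Vertices with no remaining edges do not appear in the result.

```scala
object TriangleCountSketch {
  // Returns the number of triangles passing through each vertex that has at least one edge.
  def run(edges: Seq[(Long, Long)]): Map[Long, Int] = {
    // Canonical form: no self edges, one orientation per edge, no duplicates.
    val canonical = edges
      .collect { case (a, b) if a != b => (math.min(a, b), math.max(a, b)) }
      .distinct

    // Neighbor set of every vertex, ignoring direction.
    val nbrs: Map[Long, Set[Long]] = canonical
      .flatMap { case (a, b) => Seq(a -> b, b -> a) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

    // For each edge, count the common neighbors and credit both endpoints.
    val credited = canonical.flatMap { case (a, b) =>
      val common = (nbrs(a) intersect nbrs(b)).size
      Seq(a -> common, b -> common)
    }

    // Each triangle is credited to a vertex twice (once per incident edge), so halve the sum.
    credited.groupBy(_._1).map { case (v, cs) => v -> cs.map(_._2).sum / 2 }
  }
}
```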
Various analytics functions for graphs.