# Packages

• package spark

Core Spark functionality. org.apache.spark.SparkContext serves as the main entry point to Spark, while org.apache.spark.rdd.RDD is the data type representing a distributed collection, and provides most parallel operations.

In addition, org.apache.spark.rdd.PairRDDFunctions contains operations available only on RDDs of key-value pairs, such as groupByKey and join; org.apache.spark.rdd.DoubleRDDFunctions contains operations available only on RDDs of Doubles; and org.apache.spark.rdd.SequenceFileRDDFunctions contains operations available on RDDs that can be saved as SequenceFiles. These operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)]) through implicit conversions.

Java programmers should reference the org.apache.spark.api.java package for Spark programming APIs in Java.

Classes and methods marked with Experimental are user-facing features which have not been officially adopted by the Spark project. These are subject to change or removal in minor releases.

Classes and methods marked with Developer API are intended for advanced users who want to extend Spark through lower level interfaces. These are subject to change or removal in minor releases.

• package graphx

ALPHA COMPONENT GraphX is a graph processing framework built on top of Spark.
• package lib

Various analytics functions for graphs.
• ConnectedComponents
• LabelPropagation
• PageRank
• SVDPlusPlus
• ShortestPaths
• StronglyConnectedComponents
• TriangleCount
• package util

Collections of utilities used by graphx.

# lib 

#### package lib

Various analytics functions for graphs.

Source: package.scala
Linear Supertypes: AnyRef, Any

### Value Members

1. object ConnectedComponents

Connected components algorithm.

2. object LabelPropagation

Label Propagation algorithm.

3. object PageRank extends Logging

PageRank algorithm implementation. There are two implementations of PageRank.

The first implementation uses the standalone Graph interface and runs PageRank for a fixed number of iterations:

```scala
var PR = Array.fill(n)( 1.0 )
val oldPR = Array.fill(n)( 1.0 )
for( iter <- 0 until numIter ) {
  swap(oldPR, PR)
  for( i <- 0 until n ) {
    PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
  }
}
```

The second implementation uses the Pregel interface and runs PageRank until convergence:

```scala
var PR = Array.fill(n)( 1.0 )
val oldPR = Array.fill(n)( 0.0 )
while( max(abs(PR - oldPR)) > tol ) {
  swap(oldPR, PR)
  for( i <- 0 until n if abs(PR[i] - oldPR[i]) > tol ) {
    PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
  }
}
```

alpha is the random reset probability (typically 0.15), inNbrs[i] is the set of neighbors that link to i, and outDeg[j] is the out-degree of vertex j.

Note

This is not the "normalized" PageRank and as a consequence pages that have no inlinks will have a PageRank of alpha.
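As a concrete illustration, the fixed-iteration pseudocode above can be translated into a small self-contained Scala sketch with no Spark dependency. The names `inNbrs`, `outDeg`, `alpha`, and `numIter` mirror the pseudocode; `PageRankSketch` and the tiny graphs used below are made-up examples, not part of the GraphX API.

```scala
// Plain-Scala sketch of the fixed-iteration PageRank pseudocode above.
// inNbrs(i) lists the vertices linking to i; outDeg(j) is j's out-degree.
object PageRankSketch {
  def run(inNbrs: Array[Array[Int]], outDeg: Array[Int],
          alpha: Double, numIter: Int): Array[Double] = {
    val n = inNbrs.length
    var pr = Array.fill(n)(1.0)
    for (_ <- 0 until numIter) {
      val oldPr = pr
      pr = Array.tabulate(n) { i =>
        // random reset plus damped contributions from in-neighbors
        alpha + (1 - alpha) * inNbrs(i).map(j => oldPr(j) / outDeg(j)).sum
      }
    }
    pr
  }
}
```

On a 3-cycle every rank stays at 1.0, while a vertex with no inlinks settles at alpha, matching the note above about the non-normalized formulation.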

4. object SVDPlusPlus

Implementation of SVD++ algorithm.

5. object ShortestPaths extends Serializable

Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.
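To make the result shape concrete, here is a plain-Scala sketch that produces the same kind of per-vertex landmark-distance maps with one BFS per landmark. It is not the GraphX implementation (which runs over Pregel); `ShortestPathsSketch`, the adjacency-list input, and the undirected-edge simplification are all assumptions for illustration.

```scala
// Plain-Scala sketch: shortest hop distances from every vertex to a set of
// landmarks, returned as Map(vertex -> Map(landmark -> distance)).
// Landmarks unreachable from a vertex are simply absent from its map.
object ShortestPathsSketch {
  // Level-by-level BFS from src; edges are treated as undirected here.
  private def bfs(nbrs: Map[Int, Set[Int]], src: Int): Map[Int, Int] = {
    var dist = Map(src -> 0)
    var frontier = Set(src)
    var d = 0
    while (frontier.nonEmpty) {
      d += 1
      frontier = frontier.flatMap(nbrs).filterNot(dist.contains)
      dist ++= frontier.map(_ -> d)
    }
    dist
  }

  def run(adj: Map[Int, Seq[Int]], landmarks: Seq[Int]): Map[Int, Map[Int, Int]] = {
    val vertices: Set[Int] = adj.keySet ++ adj.values.flatten
    // Symmetrize the adjacency list into undirected neighbor sets.
    val nbrs: Map[Int, Set[Int]] = vertices.map { v =>
      v -> (adj.getOrElse(v, Nil).toSet ++ adj.collect { case (u, out) if out.contains(v) => u })
    }.toMap
    val fromLandmark = landmarks.map(l => l -> bfs(nbrs, l)).toMap
    // Invert: for each vertex, collect its distance to each reachable landmark.
    vertices.map { v =>
      v -> landmarks.flatMap(l => fromLandmark(l).get(v).map(l -> _)).toMap
    }.toMap
  }
}
```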

6. object StronglyConnectedComponents

Strongly connected components algorithm implementation.

7. object TriangleCount

Compute the number of triangles passing through each vertex.

The algorithm is relatively straightforward and can be computed in three steps:

• Compute the set of neighbors for each vertex.
• For each edge compute the intersection of the sets and send the count to both vertices.
• Compute the sum at each vertex and divide by two since each triangle is counted twice.
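The three steps above can be sketched in plain Scala on a small, already-clean edge list (no self edges, no duplicates); GraphX performs each step as a distributed operation, but the logic is the same. `TriangleCountSketch` is a made-up name for illustration, not the GraphX object.

```scala
// Plain-Scala sketch of the three-step triangle count described above.
object TriangleCountSketch {
  def run(edges: Seq[(Int, Int)]): Map[Int, Int] = {
    // Step 1: neighbor set for each vertex (edges treated as undirected).
    val nbrs: Map[Int, Set[Int]] =
      edges.flatMap { case (a, b) => Seq(a -> b, b -> a) }
        .groupBy(_._1).map { case (v, es) => v -> es.map(_._2).toSet }
    // Step 2: per edge, intersect the endpoint neighbor sets and credit
    // the common-neighbor count to both endpoints.
    val credits = edges.flatMap { case (a, b) =>
      val common = (nbrs(a) & nbrs(b)).size
      Seq(a -> common, b -> common)
    }
    // Step 3: sum per vertex and halve, since each triangle is counted twice.
    credits.groupBy(_._1).map { case (v, cs) => v -> cs.map(_._2).sum / 2 }
  }
}
```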

There are two implementations. The default TriangleCount.run implementation first removes self cycles and canonicalizes the graph to ensure that the following conditions hold:

• There are no self edges
• All edges are oriented (src is greater than dst)
• There are no duplicate edges

However, the canonicalization procedure is costly as it requires repartitioning the graph. If the input data is already in "canonical form" with self cycles removed, then TriangleCount.runPreCanonicalized should be used instead:

```scala
val canonicalGraph = graph.mapEdges(e => 1).removeSelfEdges().canonicalizeEdges()
val counts = TriangleCount.runPreCanonicalized(canonicalGraph).vertices
```