org.apache.spark.graphx.lib
Class PageRank

Object
  extended by org.apache.spark.graphx.lib.PageRank
All Implemented Interfaces:
Logging

public class PageRank
extends Object
implements Logging

PageRank algorithm implementation. There are two implementations of PageRank implemented.

The first implementation uses the standalone Graph interface and runs PageRank for a fixed number of iterations:


 var PR = Array.fill(n)( 1.0 )
 val oldPR = Array.fill(n)( 1.0 )
 for( iter <- 0 until numIter ) {
   swap(oldPR, PR)
   for( i <- 0 until n ) {
     PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
   }
 }
 

The second implementation uses the Pregel interface and runs PageRank until convergence:


 var PR = Array.fill(n)( 1.0 )
 val oldPR = Array.fill(n)( 0.0 )
 while( max(abs(PR - oldPr)) > tol ) {
   swap(oldPR, PR)
   for( i <- 0 until n if abs(PR[i] - oldPR[i]) > tol ) {
     PR[i] = alpha + (1 - \alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
   }
 }
 

alpha is the random reset probability (typically 0.15), inNbrs[i] is the set of neighbors whick link to i and outDeg[j] is the out degree of vertex j.

Note that this is not the "normalized" PageRank and as a consequence pages that have no inlinks will have a PageRank of alpha.


Constructor Summary
PageRank()
           
 
Method Summary
static
<VD,ED> Graph<Object,Object>
run(Graph<VD,ED> graph, int numIter, double resetProb, scala.reflect.ClassTag<VD> evidence$1, scala.reflect.ClassTag<ED> evidence$2)
          Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
static
<VD,ED> Graph<Object,Object>
runUntilConvergence(Graph<VD,ED> graph, double tol, double resetProb, scala.reflect.ClassTag<VD> evidence$5, scala.reflect.ClassTag<ED> evidence$6)
          Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
static
<VD,ED> Graph<Object,Object>
runUntilConvergenceWithOptions(Graph<VD,ED> graph, double tol, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$7, scala.reflect.ClassTag<ED> evidence$8)
          Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
static
<VD,ED> Graph<Object,Object>
runWithOptions(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$3, scala.reflect.ClassTag<ED> evidence$4)
          Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

PageRank

public PageRank()
Method Detail

run

public static <VD,ED> Graph<Object,Object> run(Graph<VD,ED> graph,
                                               int numIter,
                                               double resetProb,
                                               scala.reflect.ClassTag<VD> evidence$1,
                                               scala.reflect.ClassTag<ED> evidence$2)
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.

Parameters:
graph - the graph on which to compute PageRank
numIter - the number of iterations of PageRank to run
resetProb - the random reset probability (alpha)

evidence$1 - (undocumented)
evidence$2 - (undocumented)
Returns:
the graph containing with each vertex containing the PageRank and each edge containing the normalized weight.

runWithOptions

public static <VD,ED> Graph<Object,Object> runWithOptions(Graph<VD,ED> graph,
                                                          int numIter,
                                                          double resetProb,
                                                          scala.Option<Object> srcId,
                                                          scala.reflect.ClassTag<VD> evidence$3,
                                                          scala.reflect.ClassTag<ED> evidence$4)
Run PageRank for a fixed number of iterations returning a graph with vertex attributes containing the PageRank and edge attributes the normalized edge weight.

Parameters:
graph - the graph on which to compute PageRank
numIter - the number of iterations of PageRank to run
resetProb - the random reset probability (alpha)
srcId - the source vertex for a Personalized Page Rank (optional)

evidence$3 - (undocumented)
evidence$4 - (undocumented)
Returns:
the graph containing with each vertex containing the PageRank and each edge containing the normalized weight.


runUntilConvergence

public static <VD,ED> Graph<Object,Object> runUntilConvergence(Graph<VD,ED> graph,
                                                               double tol,
                                                               double resetProb,
                                                               scala.reflect.ClassTag<VD> evidence$5,
                                                               scala.reflect.ClassTag<ED> evidence$6)
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

Parameters:
graph - the graph on which to compute PageRank
tol - the tolerance allowed at convergence (smaller => more accurate).
resetProb - the random reset probability (alpha)

evidence$5 - (undocumented)
evidence$6 - (undocumented)
Returns:
the graph containing with each vertex containing the PageRank and each edge containing the normalized weight.

runUntilConvergenceWithOptions

public static <VD,ED> Graph<Object,Object> runUntilConvergenceWithOptions(Graph<VD,ED> graph,
                                                                          double tol,
                                                                          double resetProb,
                                                                          scala.Option<Object> srcId,
                                                                          scala.reflect.ClassTag<VD> evidence$7,
                                                                          scala.reflect.ClassTag<ED> evidence$8)
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

Parameters:
graph - the graph on which to compute PageRank
tol - the tolerance allowed at convergence (smaller => more accurate).
resetProb - the random reset probability (alpha)
srcId - the source vertex for a Personalized Page Rank (optional)

evidence$7 - (undocumented)
evidence$8 - (undocumented)
Returns:
the graph containing with each vertex containing the PageRank and each edge containing the normalized weight.