Class PageRank

Object
org.apache.spark.graphx.lib.PageRank

public class PageRank extends Object
PageRank algorithm implementation. There are two implementations of PageRank.

The first implementation uses the standalone Graph interface and runs PageRank for a fixed number of iterations:


 var PR = Array.fill(n)( 1.0 )
 val oldPR = Array.fill(n)( 1.0 )
 for( iter <- 0 until numIter ) {
   swap(oldPR, PR)
   for( i <- 0 until n ) {
     PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
   }
 }
 

The second implementation uses the Pregel interface and runs PageRank until convergence:


 var PR = Array.fill(n)( 1.0 )
 val oldPR = Array.fill(n)( 0.0 )
 while( max(abs(PR - oldPR)) > tol ) {
   swap(oldPR, PR)
   for( i <- 0 until n if abs(PR[i] - oldPR[i]) > tol ) {
     PR[i] = alpha + (1 - alpha) * inNbrs[i].map(j => oldPR[j] / outDeg[j]).sum
   }
 }
 

alpha is the random reset probability (typically 0.15), inNbrs[i] is the set of neighbors that link to i, and outDeg[j] is the out-degree of vertex j.
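
As a rough illustration of the fixed-iteration pseudocode above, here is a minimal, self-contained Java sketch that works on a plain edge list. It does not use GraphX. The class name FixedIterationPageRank, the int[][] edge-list representation, and the 0-based vertex ids are assumptions made only for this example.

```java
import java.util.Arrays;

final class FixedIterationPageRank {
    // Hypothetical helper: edges[k] = {src, dst}, vertices are numbered 0 until n.
    static double[] run(int n, int[][] edges, int numIter, double alpha) {
        // outDeg[j] = number of edges leaving vertex j
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;

        // Every vertex starts with rank 1.0, as in the pseudocode.
        double[] pr = new double[n];
        Arrays.fill(pr, 1.0);

        for (int iter = 0; iter < numIter; iter++) {
            // incoming[i] = sum over in-neighbors j of oldPR[j] / outDeg[j]
            double[] incoming = new double[n];
            for (int[] e : edges) incoming[e[1]] += pr[e[0]] / outDeg[e[0]];
            for (int i = 0; i < n; i++) pr[i] = alpha + (1 - alpha) * incoming[i];
        }
        return pr;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}};
        System.out.println(Arrays.toString(run(3, edges, 10, 0.15)));
    }
}
```

In the three-vertex graph used in main, vertex 0 has no inlinks, so its rank settles at alpha. This is consistent with the unnormalized behaviour this page describes.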

Note:
This is not the "normalized" PageRank; as a consequence, pages that have no inlinks will have a PageRank of alpha.
  • Constructor Summary

    Constructors
    Constructor
    Description
    PageRank()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static org.apache.spark.internal.Logging.LogStringContext
    LogStringContext(scala.StringContext sc)
     
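    static org.slf4j.Logger
    org$apache$spark$internal$Logging$$log_()
     
    static void
    org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
     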
    static <VD, ED> Graph<Object,Object>
    run(Graph<VD,ED> graph, int numIter, double resetProb, scala.reflect.ClassTag<VD> evidence$1, scala.reflect.ClassTag<ED> evidence$2)
    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Vector,Object>
    runParallelPersonalizedPageRank(Graph<VD,ED> graph, int numIter, double resetProb, long[] sources, scala.reflect.ClassTag<VD> evidence$11, scala.reflect.ClassTag<ED> evidence$12)
    Run Personalized PageRank for a fixed number of iterations, for a set of starting nodes in parallel.
    static <VD, ED> Graph<Object,Object>
    runUntilConvergence(Graph<VD,ED> graph, double tol, double resetProb, scala.reflect.ClassTag<VD> evidence$13, scala.reflect.ClassTag<ED> evidence$14)
    Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Object,Object>
    runUntilConvergenceWithOptions(Graph<VD,ED> graph, double tol, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$15, scala.reflect.ClassTag<ED> evidence$16)
    Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Object,Object>
    runWithOptions(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, boolean normalized, scala.reflect.ClassTag<VD> evidence$5, scala.reflect.ClassTag<ED> evidence$6)
    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Object,Object>
    runWithOptions(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$3, scala.reflect.ClassTag<ED> evidence$4)
    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Object,Object>
    runWithOptionsWithPreviousPageRank(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, boolean normalized, Graph<Object,Object> preRankGraph, scala.reflect.ClassTag<VD> evidence$9, scala.reflect.ClassTag<ED> evidence$10)
    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
    static <VD, ED> Graph<Object,Object>
    runWithOptionsWithPreviousPageRank(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, Graph<Object,Object> preRankGraph, scala.reflect.ClassTag<VD> evidence$7, scala.reflect.ClassTag<ED> evidence$8)
    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • PageRank

      public PageRank()
  • Method Details

    • run

      public static <VD, ED> Graph<Object,Object> run(Graph<VD,ED> graph, int numIter, double resetProb, scala.reflect.ClassTag<VD> evidence$1, scala.reflect.ClassTag<ED> evidence$2)
      Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      numIter - the number of iterations of PageRank to run
      resetProb - the random reset probability (alpha)

      evidence$1 - (undocumented)
      evidence$2 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.
    • runWithOptions

      public static <VD, ED> Graph<Object,Object> runWithOptions(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$3, scala.reflect.ClassTag<ED> evidence$4)
      Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      numIter - the number of iterations of PageRank to run
      resetProb - the random reset probability (alpha)
      srcId - the source vertex for a Personalized PageRank (optional)

      evidence$3 - (undocumented)
      evidence$4 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.
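
      To make the role of srcId more concrete, the following is a minimal Java sketch of a personalized variant on a plain edge list. It rests on two assumptions that this page does not state: all initial rank sits on the source vertex, and only the source receives the reset mass. The class name PersonalizedPageRankSketch and the int[][] edge-list input are hypothetical, and GraphX itself is not used.

```java
import java.util.Arrays;

final class PersonalizedPageRankSketch {
    // Hypothetical helper: edges[k] = {src, dst}, vertices are numbered 0 until n.
    static double[] run(int n, int[][] edges, int numIter, double resetProb, int src) {
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;

        // Assumption: rank starts at 1.0 on the source and 0.0 everywhere else.
        double[] pr = new double[n];
        pr[src] = 1.0;

        for (int iter = 0; iter < numIter; iter++) {
            double[] incoming = new double[n];
            for (int[] e : edges) incoming[e[1]] += pr[e[0]] / outDeg[e[0]];
            for (int i = 0; i < n; i++) {
                // Assumption: a random reset always jumps back to the source vertex.
                double reset = (i == src) ? resetProb : 0.0;
                pr[i] = reset + (1 - resetProb) * incoming[i];
            }
        }
        return pr;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {3, 0}};
        System.out.println(Arrays.toString(run(4, edges, 5, 0.15, 0)));
    }
}
```

      Under these assumptions, a vertex that cannot be reached from the source (vertex 3 in main) keeps a rank of zero.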

    • runWithOptions

      public static <VD, ED> Graph<Object,Object> runWithOptions(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, boolean normalized, scala.reflect.ClassTag<VD> evidence$5, scala.reflect.ClassTag<ED> evidence$6)
      Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      numIter - the number of iterations of PageRank to run
      resetProb - the random reset probability (alpha)
      srcId - the source vertex for a Personalized PageRank (optional)
      normalized - whether to normalize the rank sum

      evidence$5 - (undocumented)
      evidence$6 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.

      Since:
      3.2.0
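
      One way to picture what normalizing the rank sum could mean is to rescale all ranks by a common factor so that they add up to a chosen target. The Java sketch below does exactly that. The target value (for example the vertex count) is an assumption of this example and is not stated on this page, and the class name RankSumNormalization is hypothetical.

```java
import java.util.Arrays;

final class RankSumNormalization {
    // Rescales ranks so that they sum to targetSum while keeping their ratios.
    static double[] normalize(double[] ranks, double targetSum) {
        double sum = 0.0;
        for (double r : ranks) sum += r;
        if (sum == 0.0) return ranks.clone(); // nothing to rescale

        double[] out = new double[ranks.length];
        for (int i = 0; i < ranks.length; i++) out[i] = ranks[i] * targetSum / sum;
        return out;
    }

    public static void main(String[] args) {
        double[] ranks = {0.15, 0.21375, 0.3954375};
        // Assumption: the target is the number of vertices.
        System.out.println(Arrays.toString(normalize(ranks, ranks.length)));
    }
}
```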
    • runWithOptionsWithPreviousPageRank

      public static <VD, ED> Graph<Object,Object> runWithOptionsWithPreviousPageRank(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, Graph<Object,Object> preRankGraph, scala.reflect.ClassTag<VD> evidence$7, scala.reflect.ClassTag<ED> evidence$8)
      Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      numIter - the number of iterations of PageRank to run
      resetProb - the random reset probability (alpha)
      srcId - the source vertex for a Personalized PageRank (optional)
      preRankGraph - PageRank graph from which to keep iterating

      evidence$7 - (undocumented)
      evidence$8 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.
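
      The idea of continuing from a previous PageRank graph can be sketched in plain Java by passing in the starting ranks instead of always starting from 1.0. This assumes that preRankGraph only supplies the starting rank of each vertex. The class name WarmStartPageRankSketch and the array-based inputs are hypothetical, and GraphX is not used.

```java
import java.util.Arrays;

final class WarmStartPageRankSketch {
    // Runs numIter update steps starting from startRanks (which is not modified).
    static double[] iterate(int n, int[][] edges, int numIter,
                            double resetProb, double[] startRanks) {
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;

        double[] pr = startRanks.clone();
        for (int iter = 0; iter < numIter; iter++) {
            double[] incoming = new double[n];
            for (int[] e : edges) incoming[e[1]] += pr[e[0]] / outDeg[e[0]];
            for (int i = 0; i < n; i++) pr[i] = resetProb + (1 - resetProb) * incoming[i];
        }
        return pr;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}, {2, 0}, {0, 2}};
        double[] ones = {1.0, 1.0, 1.0};
        double[] firstPass = iterate(3, edges, 2, 0.15, ones);
        // Resume from the earlier result instead of starting over.
        double[] resumed = iterate(3, edges, 3, 0.15, firstPass);
        System.out.println(Arrays.toString(resumed));
    }
}
```

      In this sketch, two iterations followed by three resumed iterations give the same ranks as five iterations from scratch, which is the point of keeping the earlier result.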

    • runWithOptionsWithPreviousPageRank

      public static <VD, ED> Graph<Object,Object> runWithOptionsWithPreviousPageRank(Graph<VD,ED> graph, int numIter, double resetProb, scala.Option<Object> srcId, boolean normalized, Graph<Object,Object> preRankGraph, scala.reflect.ClassTag<VD> evidence$9, scala.reflect.ClassTag<ED> evidence$10)
      Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      numIter - the number of iterations of PageRank to run
      resetProb - the random reset probability (alpha)
      srcId - the source vertex for a Personalized PageRank (optional)
      normalized - whether to normalize the rank sum
      preRankGraph - PageRank graph from which to keep iterating

      evidence$9 - (undocumented)
      evidence$10 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.

      Since:
      3.2.0
    • runParallelPersonalizedPageRank

      public static <VD, ED> Graph<Vector,Object> runParallelPersonalizedPageRank(Graph<VD,ED> graph, int numIter, double resetProb, long[] sources, scala.reflect.ClassTag<VD> evidence$11, scala.reflect.ClassTag<ED> evidence$12)
      Run Personalized PageRank for a fixed number of iterations, for a set of starting nodes in parallel. Returns a graph with vertex attributes containing the PageRank relative to all starting nodes (as a sparse vector) and edge attributes containing the normalized edge weight.

      Parameters:
      graph - The graph on which to compute Personalized PageRank
      numIter - The number of iterations to run
      resetProb - The random reset probability
      sources - The list of sources to compute Personalized PageRank from
      evidence$11 - (undocumented)
      evidence$12 - (undocumented)
      Returns:
      the graph with vertex attributes containing the PageRank relative to all starting nodes (as a sparse vector indexed by the position of nodes in the sources list) and edge attributes containing the normalized edge weight.
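
      The shape of the result, one rank per source for every vertex, indexed by the position of the source in the sources list, can be illustrated with a small Java sketch. It uses a dense double[][] instead of a sparse vector, and it assumes the same personalized update as a single-source run (initial rank 1.0 on each source, reset mass returned only to that source). These are simplifications for illustration, not a description of the GraphX internals, and the class name ParallelPersonalizedSketch is hypothetical.

```java
import java.util.Arrays;

final class ParallelPersonalizedSketch {
    // Returns pr[v][s]: rank of vertex v relative to sources[s].
    static double[][] run(int n, int[][] edges, int numIter,
                          double resetProb, int[] sources) {
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;

        int k = sources.length;
        double[][] pr = new double[n][k];
        // Assumption: column s starts with all rank on sources[s].
        for (int s = 0; s < k; s++) pr[sources[s]][s] = 1.0;

        for (int iter = 0; iter < numIter; iter++) {
            double[][] incoming = new double[n][k];
            for (int[] e : edges) {
                for (int s = 0; s < k; s++) {
                    incoming[e[1]][s] += pr[e[0]][s] / outDeg[e[0]];
                }
            }
            for (int i = 0; i < n; i++) {
                for (int s = 0; s < k; s++) {
                    double reset = (i == sources[s]) ? resetProb : 0.0;
                    pr[i][s] = reset + (1 - resetProb) * incoming[i][s];
                }
            }
        }
        return pr;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {1, 2}};
        double[][] pr = run(3, edges, 5, 0.15, new int[] {0, 1});
        System.out.println(Arrays.deepToString(pr));
    }
}
```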
    • runUntilConvergence

      public static <VD, ED> Graph<Object,Object> runUntilConvergence(Graph<VD,ED> graph, double tol, double resetProb, scala.reflect.ClassTag<VD> evidence$13, scala.reflect.ClassTag<ED> evidence$14)
      Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      tol - the tolerance allowed at convergence (smaller => more accurate).
      resetProb - the random reset probability (alpha)

      evidence$13 - (undocumented)
      evidence$14 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.
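
      The effect of tol can be sketched with a plain Java loop that keeps updating until the largest per-vertex change is at most tol. This is a simplification: it updates every vertex in every round, whereas the Pregel-style pseudocode on this page only updates vertices whose rank is still changing. The class name ConvergencePageRankSketch and the int[][] edge-list input are hypothetical.

```java
import java.util.Arrays;

final class ConvergencePageRankSketch {
    // Hypothetical helper: edges[k] = {src, dst}, vertices are numbered 0 until n.
    static double[] run(int n, int[][] edges, double tol, double resetProb) {
        int[] outDeg = new int[n];
        for (int[] e : edges) outDeg[e[0]]++;

        double[] pr = new double[n];
        Arrays.fill(pr, 1.0);

        double maxDelta;
        do {
            double[] incoming = new double[n];
            for (int[] e : edges) incoming[e[1]] += pr[e[0]] / outDeg[e[0]];

            maxDelta = 0.0;
            for (int i = 0; i < n; i++) {
                double next = resetProb + (1 - resetProb) * incoming[i];
                maxDelta = Math.max(maxDelta, Math.abs(next - pr[i]));
                pr[i] = next;
            }
        } while (maxDelta > tol); // stop once no rank moved by more than tol

        return pr;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}};
        System.out.println(Arrays.toString(run(3, edges, 1e-9, 0.15)));
    }
}
```

      A smaller tol makes the loop run longer and brings the ranks closer to the fixed point of the update rule.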
    • runUntilConvergenceWithOptions

      public static <VD, ED> Graph<Object,Object> runUntilConvergenceWithOptions(Graph<VD,ED> graph, double tol, double resetProb, scala.Option<Object> srcId, scala.reflect.ClassTag<VD> evidence$15, scala.reflect.ClassTag<ED> evidence$16)
      Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

      Parameters:
      graph - the graph on which to compute PageRank
      tol - the tolerance allowed at convergence (smaller => more accurate).
      resetProb - the random reset probability (alpha)
      srcId - the source vertex for a Personalized PageRank (optional)

      evidence$15 - (undocumented)
      evidence$16 - (undocumented)
      Returns:
      the graph with each vertex containing the PageRank and each edge containing the normalized weight.
    • org$apache$spark$internal$Logging$$log_

      public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
    • org$apache$spark$internal$Logging$$log__$eq

      public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
    • LogStringContext

      public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)