public class Pregel
extends Object
Unlike the original Pregel API, the GraphX Pregel API factors the sendMessage computation over edges, enables the message sending computation to read both vertex attributes, and constrains messages to the graph structure. These changes allow for substantially more efficient distributed execution while also exposing greater flexibility for graph-based computation.
val pagerankGraph: Graph[Double, Double] = graph
// Associate the degree with each vertex
.outerJoinVertices(graph.outDegrees) {
(vid, vdata, deg) => deg.getOrElse(0)
}
// Set the weight on the edges based on the degree
.mapTriplets(e => 1.0 / e.srcAttr)
// Set the vertex attributes to the initial pagerank values
.mapVertices((id, attr) => 1.0)
def vertexProgram(id: VertexId, attr: Double, msgSum: Double): Double =
resetProb + (1.0 - resetProb) * msgSum
def sendMessage(id: VertexId, edge: EdgeTriplet[Double, Double]): Iterator[(VertexId, Double)] =
Iterator((edge.dstId, edge.srcAttr * edge.attr))
def messageCombiner(a: Double, b: Double): Double = a + b
val initialMessage = 0.0
// Execute Pregel for a fixed number of iterations.
Pregel(pagerankGraph, initialMessage, numIter)(
vertexProgram, sendMessage, messageCombiner)
Constructor and Description |
---|
Pregel() |
Modifier and Type | Method and Description |
---|---|
static <VD,ED,A> Graph<VD,ED> |
apply(Graph<VD,ED> graph,
A initialMsg,
int maxIterations,
EdgeDirection activeDirection,
scala.Function3<Object,VD,A,VD> vprog,
scala.Function1<EdgeTriplet<VD,ED>,scala.collection.Iterator<scala.Tuple2<Object,A>>> sendMsg,
scala.Function2<A,A,A> mergeMsg,
scala.reflect.ClassTag<VD> evidence$1,
scala.reflect.ClassTag<ED> evidence$2,
scala.reflect.ClassTag<A> evidence$3)
Execute a Pregel-like iterative vertex-parallel abstraction.
|
static void |
org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1) |
static org.slf4j.Logger |
org$apache$spark$internal$Logging$$log_() |
public static <VD,ED,A> Graph<VD,ED> apply(Graph<VD,ED> graph, A initialMsg, int maxIterations, EdgeDirection activeDirection, scala.Function3<Object,VD,A,VD> vprog, scala.Function1<EdgeTriplet<VD,ED>,scala.collection.Iterator<scala.Tuple2<Object,A>>> sendMsg, scala.Function2<A,A,A> mergeMsg, scala.reflect.ClassTag<VD> evidence$1, scala.reflect.ClassTag<ED> evidence$2, scala.reflect.ClassTag<A> evidence$3)
vprog
is executed in parallel on
each vertex receiving any inbound messages and computing a new
value for the vertex. The sendMsg
function is then invoked on
all out-edges and is used to compute an optional message to the
destination vertex. The mergeMsg
function is a commutative
associative function used to combine messages destined to the
same vertex.
On the first iteration all vertices receive the initialMsg
and
on subsequent iterations if a vertex does not receive a message
then the vertex-program is not invoked.
This function iterates until there are no remaining messages, or
for maxIterations
iterations.
graph
- the input graph.
initialMsg
- the message each vertex will receive at the first
iteration
maxIterations
- the maximum number of iterations to run for
activeDirection
- the direction of edges incident to a vertex that received a message in
the previous round on which to run sendMsg
. For example, if this is EdgeDirection.Out
, only
out-edges of vertices that received a message in the previous round will run. The default is
EdgeDirection.Either
, which will run sendMsg
on edges where either side received a message
in the previous round. If this is EdgeDirection.Both
, sendMsg
will only run on edges where
*both* vertices received a message.
vprog
- the user-defined vertex program which runs on each
vertex and receives the inbound message and computes a new vertex
value. On the first iteration the vertex program is invoked on
all vertices and is passed the default message. On subsequent
iterations the vertex program is only invoked on those vertices
that receive messages.
sendMsg
- a user supplied function that is applied to out
edges of vertices that received messages in the current
iteration
mergeMsg
- a user supplied function that takes two incoming
messages of type A and merges them into a single message of type
A. ''This function must be commutative and associative and
ideally the size of A should not increase.''
evidence$1
- (undocumented)evidence$2
- (undocumented)evidence$3
- (undocumented)public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)