VD
- the vertex attribute typeED
- the edge attribute typepublic class GraphOps<VD,ED>
extends Object
implements scala.Serializable
Graph
. All operations are expressed in terms of the
efficient GraphX API. This class is implicitly constructed for each Graph object.
Constructor and Description |
---|
GraphOps(Graph<VD,ED> graph,
scala.reflect.ClassTag<VD> evidence$1,
scala.reflect.ClassTag<ED> evidence$2) |
Modifier and Type | Method and Description |
---|---|
VertexRDD<Edge<ED>[]> |
collectEdges(EdgeDirection edgeDirection)
Returns an RDD that contains for each vertex v its local edges,
i.e., the edges that are incident on v, in the user-specified direction.
|
VertexRDD<long[]> |
collectNeighborIds(EdgeDirection edgeDirection)
Collect the neighbor vertex ids for each vertex.
|
VertexRDD<scala.Tuple2<Object,VD>[]> |
collectNeighbors(EdgeDirection edgeDirection)
Collect the neighbor vertex attributes for each vertex.
|
Graph<Object,ED> |
connectedComponents()
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
|
Graph<Object,ED> |
connectedComponents(int maxIterations)
Compute the connected component membership of each vertex and return a graph with the vertex
value containing the lowest vertex id in the connected component containing that vertex.
|
Graph<VD,ED> |
convertToCanonicalEdges(scala.Function2<ED,ED,ED> mergeFunc)
Convert bi-directional edges into uni-directional ones.
|
VertexRDD<Object> |
degrees() |
<VD2,ED2> Graph<VD,ED> |
filter(scala.Function1<Graph<VD,ED>,Graph<VD2,ED2>> preprocess,
scala.Function1<EdgeTriplet<VD2,ED2>,Object> epred,
scala.Function2<Object,VD2,Object> vpred,
scala.reflect.ClassTag<VD2> evidence$4,
scala.reflect.ClassTag<ED2> evidence$5)
Filter the graph by computing some values to filter on, and applying the predicates.
|
VertexRDD<Object> |
inDegrees() |
<U> Graph<VD,ED> |
joinVertices(RDD<scala.Tuple2<Object,U>> table,
scala.Function3<Object,VD,U,VD> mapFunc,
scala.reflect.ClassTag<U> evidence$3)
Join the vertices with an RDD and then apply a function from the
vertex and RDD entry to a new vertex value.
|
long |
numEdges() |
long |
numVertices() |
VertexRDD<Object> |
outDegrees() |
Graph<Object,Object> |
pageRank(double tol,
double resetProb)
Run a dynamic version of PageRank returning a graph with vertex attributes containing the
PageRank and edge attributes containing the normalized edge weight.
|
Graph<Object,Object> |
personalizedPageRank(long src,
double tol,
double resetProb)
Run personalized PageRank for a given vertex, such that all random walks
are started relative to the source node.
|
long |
pickRandomVertex()
Picks a random vertex from the graph and returns its ID.
|
<A> Graph<VD,ED> |
pregel(A initialMsg,
int maxIterations,
EdgeDirection activeDirection,
scala.Function3<Object,VD,A,VD> vprog,
scala.Function1<EdgeTriplet<VD,ED>,scala.collection.Iterator<scala.Tuple2<Object,A>>> sendMsg,
scala.Function2<A,A,A> mergeMsg,
scala.reflect.ClassTag<A> evidence$6)
Execute a Pregel-like iterative vertex-parallel abstraction.
|
Graph<VD,ED> |
removeSelfEdges()
Remove self edges.
|
Graph<Object,Object> |
staticPageRank(int numIter,
double resetProb)
Run PageRank for a fixed number of iterations returning a graph with vertex attributes
containing the PageRank and edge attributes the normalized edge weight.
|
Graph<Object,Object> |
staticPageRank(int numIter,
double resetProb,
Graph<Object,Object> prePageRank)
Run PageRank for a fixed number of iterations returning a graph with vertex attributes
containing the PageRank and edge attributes the normalized edge weight, optionally including
including a previous pageRank computation to be used as a start point for the new iterations
|
Graph<Vector,Object> |
staticParallelPersonalizedPageRank(long[] sources,
int numIter,
double resetProb)
Run parallel personalized PageRank for a given array of source vertices, such
that all random walks are started relative to the source vertices
|
Graph<Object,Object> |
staticPersonalizedPageRank(long src,
int numIter,
double resetProb)
Run Personalized PageRank for a fixed number of iterations with
with all iterations originating at the source node
returning a graph with vertex attributes
containing the PageRank and edge attributes the normalized edge weight.
|
Graph<Object,ED> |
stronglyConnectedComponents(int numIter)
Compute the strongly connected component (SCC) of each vertex and return a graph with the
vertex value containing the lowest vertex id in the SCC containing that vertex.
|
Graph<Object,ED> |
triangleCount()
Compute the number of triangles passing through each vertex.
|
public VertexRDD<Edge<ED>[]> collectEdges(EdgeDirection edgeDirection)
edgeDirection
- the direction along which to collect
the local edges of vertices
public VertexRDD<long[]> collectNeighborIds(EdgeDirection edgeDirection)
edgeDirection
- the direction along which to collect
neighboring vertices
public VertexRDD<scala.Tuple2<Object,VD>[]> collectNeighbors(EdgeDirection edgeDirection)
edgeDirection
- the direction along which to collect
neighboring vertices
public Graph<Object,ED> connectedComponents()
org.apache.spark.graphx.lib.ConnectedComponents.run
public Graph<Object,ED> connectedComponents(int maxIterations)
maxIterations
- (undocumented)org.apache.spark.graphx.lib.ConnectedComponents.run
public Graph<VD,ED> convertToCanonicalEdges(scala.Function2<ED,ED,ED> mergeFunc)
mergeFunc
- the user defined reduce function which should
be commutative and associative and is used to combine the output
of the map phase
public VertexRDD<Object> degrees()
public <VD2,ED2> Graph<VD,ED> filter(scala.Function1<Graph<VD,ED>,Graph<VD2,ED2>> preprocess, scala.Function1<EdgeTriplet<VD2,ED2>,Object> epred, scala.Function2<Object,VD2,Object> vpred, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<ED2> evidence$5)
preprocess
- a function to compute new vertex and edge data before filteringepred
- edge pred to filter on after preprocess, see more details under
Graph.subgraph(scala.Function1<org.apache.spark.graphx.EdgeTriplet<VD, ED>, java.lang.Object>, scala.Function2<java.lang.Object, VD, java.lang.Object>)
vpred
- vertex pred to filter on after preprocess, see more details under
Graph.subgraph(scala.Function1<org.apache.spark.graphx.EdgeTriplet<VD, ED>, java.lang.Object>, scala.Function2<java.lang.Object, VD, java.lang.Object>)
evidence$4
- (undocumented)evidence$5
- (undocumented)
graph.filter(
graph => {
val degrees: VertexRDD[Int] = graph.outDegrees
graph.outerJoinVertices(degrees) {(vid, data, deg) => deg.getOrElse(0)}
},
vpred = (vid: VertexId, deg:Int) => deg > 0
)
public VertexRDD<Object> inDegrees()
public <U> Graph<VD,ED> joinVertices(RDD<scala.Tuple2<Object,U>> table, scala.Function3<Object,VD,U,VD> mapFunc, scala.reflect.ClassTag<U> evidence$3)
table
- the table to join with the vertices in the graph.
The table should contain at most one entry for each vertex.mapFunc
- the function used to compute the new vertex
values. The map function is invoked only for vertices with a
corresponding entry in the table otherwise the old vertex value
is used.
evidence$3
- (undocumented)
val rawGraph: Graph[Int, Int] = GraphLoader.edgeListFile(sc, "webgraph")
.mapVertices((_, _) => 0)
val outDeg = rawGraph.outDegrees
val graph = rawGraph.joinVertices[Int](outDeg)
((_, _, outDeg) => outDeg)
public long numEdges()
public long numVertices()
public VertexRDD<Object> outDegrees()
public Graph<Object,Object> pageRank(double tol, double resetProb)
tol
- (undocumented)resetProb
- (undocumented)PageRank$.runUntilConvergence(org.apache.spark.graphx.Graph<VD, ED>, double, double, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
public Graph<Object,Object> personalizedPageRank(long src, double tol, double resetProb)
src
- (undocumented)tol
- (undocumented)resetProb
- (undocumented)PageRank$.runUntilConvergenceWithOptions(org.apache.spark.graphx.Graph<VD, ED>, double, double, scala.Option<java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
public long pickRandomVertex()
public <A> Graph<VD,ED> pregel(A initialMsg, int maxIterations, EdgeDirection activeDirection, scala.Function3<Object,VD,A,VD> vprog, scala.Function1<EdgeTriplet<VD,ED>,scala.collection.Iterator<scala.Tuple2<Object,A>>> sendMsg, scala.Function2<A,A,A> mergeMsg, scala.reflect.ClassTag<A> evidence$6)
vprog
is executed in parallel on
each vertex receiving any inbound messages and computing a new
value for the vertex. The sendMsg
function is then invoked on
all out-edges and is used to compute an optional message to the
destination vertex. The mergeMsg
function is a commutative
associative function used to combine messages destined to the
same vertex.
On the first iteration all vertices receive the initialMsg
and
on subsequent iterations if a vertex does not receive a message
then the vertex-program is not invoked.
This function iterates until there are no remaining messages, or
for maxIterations
iterations.
initialMsg
- the message each vertex will receive at the on
the first iteration
maxIterations
- the maximum number of iterations to run for
activeDirection
- the direction of edges incident to a vertex that received a message in
the previous round on which to run sendMsg
. For example, if this is EdgeDirection.Out
, only
out-edges of vertices that received a message in the previous round will run.
vprog
- the user-defined vertex program which runs on each
vertex and receives the inbound message and computes a new vertex
value. On the first iteration the vertex program is invoked on
all vertices and is passed the default message. On subsequent
iterations the vertex program is only invoked on those vertices
that receive messages.
sendMsg
- a user supplied function that is applied to out
edges of vertices that received messages in the current
iteration
mergeMsg
- a user supplied function that takes two incoming
messages of type A and merges them into a single message of type
A. ''This function must be commutative and associative and
ideally the size of A should not increase.''
evidence$6
- (undocumented)public Graph<VD,ED> removeSelfEdges()
public Graph<Object,Object> staticPageRank(int numIter, double resetProb)
numIter
- (undocumented)resetProb
- (undocumented)PageRank$.run(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
public Graph<Object,Object> staticPageRank(int numIter, double resetProb, Graph<Object,Object> prePageRank)
numIter
- (undocumented)resetProb
- (undocumented)prePageRank
- (undocumented)PageRank$.runWithOptionsWithPreviousPageRank(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.Option<java.lang.Object>, org.apache.spark.graphx.Graph<java.lang.Object, java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
public Graph<Vector,Object> staticParallelPersonalizedPageRank(long[] sources, int numIter, double resetProb)
sources
- (undocumented)numIter
- (undocumented)resetProb
- (undocumented)public Graph<Object,Object> staticPersonalizedPageRank(long src, int numIter, double resetProb)
src
- (undocumented)numIter
- (undocumented)resetProb
- (undocumented)PageRank$.runWithOptions(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.Option<java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
public Graph<Object,ED> stronglyConnectedComponents(int numIter)
numIter
- (undocumented)StronglyConnectedComponents$.run(org.apache.spark.graphx.Graph<VD, ED>, int, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)