Package org.apache.spark.graphx
Class GraphOps<VD,ED>
Object
org.apache.spark.graphx.GraphOps<VD,ED>
- Type Parameters:
VD - the vertex attribute type
ED - the edge attribute type
- All Implemented Interfaces:
Serializable, scala.Serializable
Contains additional functionality for Graph. All operations are expressed in terms of the efficient GraphX API. This class is implicitly constructed for each Graph object.
Constructor Summary
-
Method Summary
- collectEdges(EdgeDirection edgeDirection)
  Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
- VertexRDD<long[]> collectNeighborIds(EdgeDirection edgeDirection)
  Collect the neighbor vertex ids for each vertex.
- collectNeighbors(EdgeDirection edgeDirection)
  Collect the neighbor vertex attributes for each vertex.
- connectedComponents()
  Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
- connectedComponents(int maxIterations)
  Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
- convertToCanonicalEdges(scala.Function2<ED, ED, ED> mergeFunc)
  Convert bi-directional edges into uni-directional ones.
- degrees()
- filter(scala.Function1<Graph<VD, ED>, Graph<VD2, ED2>> preprocess, scala.Function1<EdgeTriplet<VD2, ED2>, Object> epred, scala.Function2<Object, VD2, Object> vpred, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<ED2> evidence$5)
  Filter the graph by computing some values to filter on, and applying the predicates.
- inDegrees()
- joinVertices(RDD<scala.Tuple2<Object, U>> table, scala.Function3<Object, VD, U, VD> mapFunc, scala.reflect.ClassTag<U> evidence$3)
  Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value.
- long numEdges()
- long numVertices()
- outDegrees()
- pageRank(double tol, double resetProb)
  Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- personalizedPageRank(long src, double tol, double resetProb)
  Run personalized PageRank for a given vertex, such that all random walks are started relative to the source node.
- long pickRandomVertex()
  Picks a random vertex from the graph and returns its ID.
- pregel(A initialMsg, int maxIterations, EdgeDirection activeDirection, scala.Function3<Object, VD, A, VD> vprog, scala.Function1<EdgeTriplet<VD, ED>, scala.collection.Iterator<scala.Tuple2<Object, A>>> sendMsg, scala.Function2<A, A, A> mergeMsg, scala.reflect.ClassTag<A> evidence$6)
  Execute a Pregel-like iterative vertex-parallel abstraction.
- removeSelfEdges()
  Remove self edges.
- staticPageRank(int numIter, double resetProb)
  Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- staticPageRank(int numIter, double resetProb, Graph<Object, Object> prePageRank)
  Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight, optionally using a previous PageRank computation as a starting point for the new iterations.
- staticParallelPersonalizedPageRank(long[] sources, int numIter, double resetProb)
  Run parallel personalized PageRank for a given array of source vertices, such that all random walks are started relative to the source vertices.
- staticPersonalizedPageRank(long src, int numIter, double resetProb)
  Run personalized PageRank for a fixed number of iterations, with all iterations originating at the source node, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- stronglyConnectedComponents(int numIter)
  Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
- triangleCount()
  Compute the number of triangles passing through each vertex.
-
Constructor Details
-
GraphOps
-
-
Method Details
-
collectEdges
Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction. Warning: singleton vertices (those with no edges in the given direction) will not be part of the return value.
- Parameters:
edgeDirection - the direction along which to collect the local edges of vertices
- Returns:
- the local edges for each vertex
- Note:
- This function could be highly inefficient on power-law graphs where high-degree vertices may force a large amount of information to be collected to a single location.
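A minimal usage sketch (not part of the original Javadoc), assuming an existing graph: Graph[String, Int]; the variable names are illustrative only:

  import org.apache.spark.graphx._
  // Collect, for every vertex, the edges leaving it.
  val outEdges = graph.collectEdges(EdgeDirection.Out)
  outEdges.take(5).foreach { case (vid, edges) =>
    println(s"vertex $vid has ${edges.length} out-edges")
  }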
-
collectNeighborIds
Collect the neighbor vertex ids for each vertex.
- Parameters:
edgeDirection - the direction along which to collect neighboring vertices
- Returns:
- the set of neighboring ids for each vertex
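A short sketch (illustrative, not from the Spark docs), assuming the same hypothetical graph: Graph[String, Int]:

  // Neighbor ids reachable over either in- or out-edges.
  val neighborIds = graph.collectNeighborIds(EdgeDirection.Either)
  neighborIds.take(5).foreach { case (vid, ids) =>
    println(s"vertex $vid neighbors: ${ids.mkString(", ")}")
  }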
-
collectNeighbors
Collect the neighbor vertex attributes for each vertex.
- Parameters:
edgeDirection - the direction along which to collect neighboring vertices
- Returns:
- the vertex set of neighboring vertex attributes for each vertex
- Note:
- This function could be highly inefficient on power-law graphs where high-degree vertices may force a large amount of information to be collected to a single location.
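Illustrative sketch, assuming graph: Graph[String, Int] where the vertex attribute is a name:

  // Each vertex gets the array of (neighborId, neighborAttribute) pairs on its in-edges.
  val neighbors = graph.collectNeighbors(EdgeDirection.In)
  neighbors.take(5).foreach { case (vid, nbrs) =>
    println(s"vertex $vid sees: ${nbrs.map(_._2).mkString(", ")}")
  }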
-
connectedComponents
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
- Returns:
- (undocumented)
- See Also:
-
org.apache.spark.graphx.lib.ConnectedComponents.run
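Illustrative sketch, assuming graph: Graph[String, Int]:

  // Each vertex is labeled with the smallest vertex id in its connected component.
  val cc = graph.connectedComponents()
  cc.vertices.take(5).foreach { case (vid, component) =>
    println(s"vertex $vid belongs to component $component")
  }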
-
connectedComponents
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
- Parameters:
maxIterations - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
org.apache.spark.graphx.lib.ConnectedComponents.run
-
convertToCanonicalEdges
Convert bi-directional edges into uni-directional ones. Some graph algorithms (e.g., TriangleCount) assume that an input graph has its edges in canonical direction. This function rewrites the vertex ids of edges so that srcIds are smaller than dstIds, and merges the duplicated edges.
- Parameters:
mergeFunc - the user-defined reduce function, which should be commutative and associative and is used to combine the output of the map phase
- Returns:
- the resulting graph with canonical edges
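Illustrative sketch, assuming graph: Graph[String, Int] with Int edge weights:

  // Rewrite edges so srcId < dstId and sum the weights of edges that become duplicates.
  val canonical = graph.convertToCanonicalEdges((a, b) => a + b)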
-
degrees
-
filter
public <VD2,ED2> Graph<VD,ED> filter(scala.Function1<Graph<VD, ED>, Graph<VD2, ED2>> preprocess, scala.Function1<EdgeTriplet<VD2, ED2>, Object> epred, scala.Function2<Object, VD2, Object> vpred, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<ED2> evidence$5)
Filter the graph by computing some values to filter on, and applying the predicates.
- Parameters:
preprocess - a function to compute new vertex and edge data before filtering
epred - edge predicate to filter on after preprocess; see more details under Graph.subgraph(scala.Function1<org.apache.spark.graphx.EdgeTriplet<VD, ED>, java.lang.Object>, scala.Function2<java.lang.Object, VD, java.lang.Object>)
vpred - vertex predicate to filter on after preprocess; see more details under Graph.subgraph(scala.Function1<org.apache.spark.graphx.EdgeTriplet<VD, ED>, java.lang.Object>, scala.Function2<java.lang.Object, VD, java.lang.Object>)
evidence$4 - (undocumented)
evidence$5 - (undocumented)
- Returns:
- a subgraph of the original graph, with its data unchanged
- Example:
- This function can be used to filter the graph based on some property, without changing the vertex and edge values in your program. For example, we could remove the vertices in a graph with 0 outdegree:

  graph.filter(
    graph => {
      val degrees: VertexRDD[Int] = graph.outDegrees
      graph.outerJoinVertices(degrees) { (vid, data, deg) => deg.getOrElse(0) }
    },
    vpred = (vid: VertexId, deg: Int) => deg > 0
  )
-
inDegrees
-
joinVertices
public <U> Graph<VD,ED> joinVertices(RDD<scala.Tuple2<Object, U>> table, scala.Function3<Object, VD, U, VD> mapFunc, scala.reflect.ClassTag<U> evidence$3)
Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value. The input table should contain at most one entry for each vertex. If no entry is provided, the map function is skipped and the old value is used.
- Parameters:
table - the table to join with the vertices in the graph. The table should contain at most one entry for each vertex.
mapFunc - the function used to compute the new vertex values. The map function is invoked only for vertices with a corresponding entry in the table; otherwise the old vertex value is used.
evidence$3 - (undocumented)
- Returns:
- (undocumented)
- Example:
- This function is used to update the vertices with new values based on external data. For example, we could add the out-degree to each vertex record:

  val rawGraph: Graph[Int, Int] = GraphLoader.edgeListFile(sc, "webgraph")
    .mapVertices((_, _) => 0)
  val outDeg = rawGraph.outDegrees
  val graph = rawGraph.joinVertices[Int](outDeg)((_, _, outDeg) => outDeg)
-
numEdges
public long numEdges()
-
numVertices
public long numVertices()
-
outDegrees
-
pageRank
Run a dynamic version of PageRank returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- Parameters:
tol - (undocumented)
resetProb - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
PageRank$.runUntilConvergence(org.apache.spark.graphx.Graph<VD, ED>, double, double, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
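Illustrative sketch, assuming graph: Graph[String, Int]; the tolerance and reset probability shown are just typical values:

  // Iterate until the per-vertex rank changes by less than the tolerance.
  val ranks = graph.pageRank(0.0001, 0.15)
  ranks.vertices.take(5).foreach { case (vid, rank) =>
    println(s"vertex $vid rank $rank")
  }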
-
personalizedPageRank
Run personalized PageRank for a given vertex, such that all random walks are started relative to the source node.
- Parameters:
src - (undocumented)
tol - (undocumented)
resetProb - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
PageRank$.runUntilConvergenceWithOptions(org.apache.spark.graphx.Graph<VD, ED>, double, double, scala.Option<java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
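Illustrative sketch, assuming graph: Graph[String, Int] and a hypothetical source vertex id 1L:

  // Random walks restart at vertex 1L, so ranks reflect proximity to that vertex.
  val ppr = graph.personalizedPageRank(1L, 0.0001, 0.15)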
-
pickRandomVertex
public long pickRandomVertex()
Picks a random vertex from the graph and returns its ID.
- Returns:
- (undocumented)
-
pregel
public <A> Graph<VD,ED> pregel(A initialMsg, int maxIterations, EdgeDirection activeDirection, scala.Function3<Object, VD, A, VD> vprog, scala.Function1<EdgeTriplet<VD, ED>, scala.collection.Iterator<scala.Tuple2<Object, A>>> sendMsg, scala.Function2<A, A, A> mergeMsg, scala.reflect.ClassTag<A> evidence$6)
Execute a Pregel-like iterative vertex-parallel abstraction. The user-defined vertex program vprog is executed in parallel on each vertex, receiving any inbound messages and computing a new value for the vertex. The sendMsg function is then invoked on all out-edges and is used to compute an optional message to the destination vertex. The mergeMsg function is a commutative, associative function used to combine messages destined to the same vertex.

On the first iteration all vertices receive the initialMsg, and on subsequent iterations, if a vertex does not receive a message, then the vertex program is not invoked. This function iterates until there are no remaining messages, or for maxIterations iterations.
- Parameters:
initialMsg - the message each vertex will receive on the first iteration
maxIterations - the maximum number of iterations to run for
activeDirection - the direction of edges incident to a vertex that received a message in the previous round on which to run sendMsg. For example, if this is EdgeDirection.Out, only out-edges of vertices that received a message in the previous round will run.
vprog - the user-defined vertex program which runs on each vertex and receives the inbound message and computes a new vertex value. On the first iteration the vertex program is invoked on all vertices and is passed the default message. On subsequent iterations the vertex program is only invoked on those vertices that receive messages.
sendMsg - a user-supplied function that is applied to out-edges of vertices that received messages in the current iteration
mergeMsg - a user-supplied function that takes two incoming messages of type A and merges them into a single message of type A. ''This function must be commutative and associative and ideally the size of A should not increase.''
evidence$6 - (undocumented)
- Returns:
- the resulting graph at the end of the computation
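A common use of pregel is single-source shortest paths. The sketch below is illustrative only; the input graph: Graph[Long, Double] and sourceId are assumptions, not part of this Javadoc. It uses the Scala API's curried form, leaving maxIterations and activeDirection at their defaults:

  import org.apache.spark.graphx._

  // Assumed input: graph: Graph[Long, Double] with edge attributes as distances.
  val sourceId: VertexId = 42L
  // Every vertex starts at infinity except the source.
  val initialGraph = graph.mapVertices((id, _) =>
    if (id == sourceId) 0.0 else Double.PositiveInfinity)
  val sssp = initialGraph.pregel(Double.PositiveInfinity)(
    (id, dist, newDist) => math.min(dist, newDist),        // vprog: keep the shorter distance
    triplet => {                                           // sendMsg: relax edges that improve dstAttr
      if (triplet.srcAttr + triplet.attr < triplet.dstAttr) {
        Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
      } else {
        Iterator.empty
      }
    },
    (a, b) => math.min(a, b)                               // mergeMsg: keep the minimum of competing messages
  )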
-
removeSelfEdges
Remove self edges.
- Returns:
- a graph with all self edges removed
-
staticPageRank
Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- Parameters:
numIter - (undocumented)
resetProb - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
PageRank$.run(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
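Illustrative sketch, assuming graph: Graph[String, Int]:

  // Exactly 10 PageRank iterations with a 0.15 reset probability.
  val ranks = graph.staticPageRank(10, 0.15)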
-
staticPageRank
public Graph<Object,Object> staticPageRank(int numIter, double resetProb, Graph<Object, Object> prePageRank)
Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight, optionally using a previous PageRank computation as a starting point for the new iterations.
- Parameters:
numIter - (undocumented)
resetProb - (undocumented)
prePageRank - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
PageRank$.runWithOptionsWithPreviousPageRank(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.Option<java.lang.Object>, org.apache.spark.graphx.Graph<java.lang.Object, java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
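Illustrative sketch, assuming graph: Graph[String, Int] and assuming the Scala overload accepts the rank graph produced by the two-argument form as its prePageRank argument:

  // Run 5 iterations, then 5 more seeded with the earlier ranks.
  val warmup = graph.staticPageRank(5, 0.15)
  val refined = graph.staticPageRank(5, 0.15, warmup)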
-
staticParallelPersonalizedPageRank
public Graph<Vector,Object> staticParallelPersonalizedPageRank(long[] sources, int numIter, double resetProb)
Run parallel personalized PageRank for a given array of source vertices, such that all random walks are started relative to the source vertices.
- Parameters:
sources - (undocumented)
numIter - (undocumented)
resetProb - (undocumented)
- Returns:
- (undocumented)
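Illustrative sketch, assuming graph: Graph[String, Int]; each vertex ends up with a vector of scores, one entry per source:

  // Personalized PageRank from three source vertices in one pass.
  val sources = Array(1L, 2L, 3L)
  val multi = graph.staticParallelPersonalizedPageRank(sources, 10, 0.15)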
-
staticPersonalizedPageRank
Run personalized PageRank for a fixed number of iterations, with all iterations originating at the source node, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.
- Parameters:
src - (undocumented)
numIter - (undocumented)
resetProb - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
PageRank$.runWithOptions(org.apache.spark.graphx.Graph<VD, ED>, int, double, scala.Option<java.lang.Object>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
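Illustrative sketch, assuming graph: Graph[String, Int] and source vertex 1L:

  // 10 iterations of personalized PageRank anchored at vertex 1L.
  val ppr = graph.staticPersonalizedPageRank(1L, 10, 0.15)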
-
stronglyConnectedComponents
Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.
- Parameters:
numIter - (undocumented)
- Returns:
- (undocumented)
- See Also:
-
StronglyConnectedComponents$.run(org.apache.spark.graphx.Graph<VD, ED>, int, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
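Illustrative sketch, assuming graph: Graph[String, Int]:

  // Each vertex is labeled with the lowest vertex id in its strongly connected component.
  val scc = graph.stronglyConnectedComponents(10)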
-
triangleCount
Compute the number of triangles passing through each vertex.
- Returns:
- (undocumented)
- See Also:
-
TriangleCount$.run(org.apache.spark.graphx.Graph<VD, ED>, scala.reflect.ClassTag<VD>, scala.reflect.ClassTag<ED>)
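Illustrative sketch, assuming graph: Graph[String, Int]; partitioning the graph first is a common precaution for triangle counting:

  // Number of triangles through each vertex.
  val triCounts = graph
    .partitionBy(PartitionStrategy.RandomVertexCut)
    .triangleCount()
    .vertices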
-