Class

org.apache.spark.graphx

GraphOps

class GraphOps[VD, ED] extends Serializable

Contains additional functionality for Graph. All operations are expressed in terms of the efficient GraphX API. This class is implicitly constructed for each Graph object.

VD

the vertex attribute type

ED

the edge attribute type

Source
GraphOps.scala
Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new GraphOps(graph: Graph[VD, ED])(implicit arg0: ClassTag[VD], arg1: ClassTag[ED])

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def collectEdges(edgeDirection: EdgeDirection): VertexRDD[Array[Edge[ED]]]

    Returns an RDD that contains, for each vertex v, its local edges, i.e., the edges incident on v in the user-specified direction. Warning: singleton vertices (those with no edges in the given direction) are not part of the return value.

    edgeDirection

    the direction along which to collect the local edges of vertices

    returns

    the local edges for each vertex

    Note

    This function could be highly inefficient on power-law graphs where high degree vertices may force a large amount of information to be collected to a single location.
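
    To illustrate the warning above, here is a minimal sketch that collects incoming edges on a hypothetical three-vertex graph. The object name, the toy edge list, and the local master setting are assumptions made for this example; only collectEdges and EdgeDirection.In are taken from this page.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, VertexId}

    object CollectEdgesSketch {
      // Hypothetical toy graph: 1 -> 2, 1 -> 3, 2 -> 3 (edge attribute is a label).
      def inEdges(sc: SparkContext): Map[VertexId, Set[(VertexId, VertexId)]] = {
        val edges = sc.parallelize(Seq(
          Edge(1L, 2L, "a"), Edge(1L, 3L, "b"), Edge(2L, 3L, "c")))
        val graph = Graph.fromEdges(edges, defaultValue = 0)

        // For each vertex, gather the edges that point at it, then keep
        // only the (source, destination) pair of every collected edge.
        graph.collectEdges(EdgeDirection.In)
          .collect()
          .map { case (id, es) => id -> es.map(e => (e.srcId, e.dstId)).toSet }
          .toMap
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("collect-edges-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          // Vertex 1 has no in-edges, so it is expected to be absent here.
          inEdges(sc).foreach(println)
        } finally sc.stop()
      }
    }
    ```

    If every vertex must appear in the result, one option is to join the output back onto graph.vertices and substitute an empty array for the missing entries.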

  7. def collectNeighborIds(edgeDirection: EdgeDirection): VertexRDD[Array[VertexId]]

    Collect the neighbor vertex ids for each vertex.

    edgeDirection

    the direction along which to collect neighboring vertices

    returns

    the set of neighboring ids for each vertex

  8. def collectNeighbors(edgeDirection: EdgeDirection): VertexRDD[Array[(VertexId, VD)]]

    Collect the neighbor vertex attributes for each vertex.

    edgeDirection

    the direction along which to collect neighboring vertices

    returns

    the ids and attributes of the neighboring vertices for each vertex

    Note

    This function could be highly inefficient on power-law graphs where high degree vertices may force a large amount of information to be collected to a single location.

  9. def connectedComponents(maxIterations: Int): Graph[VertexId, ED]

    Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.

    See also

    org.apache.spark.graphx.lib.ConnectedComponents$#run

  10. def connectedComponents(): Graph[VertexId, ED]

    Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.

    See also

    org.apache.spark.graphx.lib.ConnectedComponents$#run
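
    As a rough illustration of the labeling described above, the sketch below runs connectedComponents() on a hypothetical graph with two components. The object name, edge list, and local master setting are assumptions made for this example.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object ConnectedComponentsSketch {
      // Hypothetical toy graph with two components: {1, 2, 3} and {4, 5}.
      def componentOf(sc: SparkContext): Map[VertexId, VertexId] = {
        val edges = sc.parallelize(Seq(
          Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)))
        val graph = Graph.fromEdges(edges, defaultValue = 0)

        // Each vertex attribute becomes the lowest vertex id in its component.
        graph.connectedComponents().vertices.collect().toMap
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("cc-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          componentOf(sc).toSeq.sorted.foreach(println)
        } finally sc.stop()
      }
    }
    ```

    Grouping the resulting pairs by their second element yields the member list of each component.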

  11. def convertToCanonicalEdges(mergeFunc: (ED, ED) ⇒ ED = (e1, e2) => e1): Graph[VD, ED]

    Convert bi-directional edges into uni-directional ones. Some graph algorithms (e.g., TriangleCount) assume that an input graph has its edges in canonical direction. This function rewrites the vertex ids of edges so that srcIds are smaller than dstIds, and merges the duplicated edges.

    mergeFunc

    the user-defined reduce function, which should be commutative and associative, used to merge the attributes of duplicated edges

    returns

    the resulting graph with canonical edges
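
    The rewriting and merging described above can be sketched as follows. The object name, the edge list, and the choice of addition as mergeFunc are assumptions made for this example.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object CanonicalEdgesSketch {
      // Hypothetical toy graph: 2 -> 1 and 1 -> 2 form a bi-directional pair,
      // and 3 -> 2 points from a larger id to a smaller one.
      def canonicalEdges(sc: SparkContext): Set[(VertexId, VertexId, Int)] = {
        val edges = sc.parallelize(Seq(
          Edge(2L, 1L, 5), Edge(1L, 2L, 7), Edge(3L, 2L, 1)))
        val graph = Graph.fromEdges(edges, defaultValue = 0)

        // Orient every edge so that srcId < dstId and sum the attributes
        // of edges that end up with the same endpoints.
        val canonical = graph.convertToCanonicalEdges(_ + _)
        canonical.edges.map(e => (e.srcId, e.dstId, e.attr)).collect().toSet
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("canonical-edges-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          canonicalEdges(sc).foreach(println)
        } finally sc.stop()
      }
    }
    ```

    With the default mergeFunc, one of the duplicated attributes is kept instead of a combined value, so which attribute survives should not be relied upon.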

  12. lazy val degrees: VertexRDD[Int]

    The degree of each vertex in the graph.

    Note

    Vertices with no edges are not returned in the resulting RDD.
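
    The note above also applies to inDegrees and outDegrees. The following sketch computes all three on a hypothetical graph that includes an isolated vertex; the object name, vertex labels, and edge list are assumptions made for this example.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object DegreesSketch {
      type DegreeTable = Map[VertexId, Int]

      // Hypothetical toy graph: 1 -> 2, 1 -> 3, 2 -> 3, plus isolated vertex 4.
      def degreeTables(sc: SparkContext): (DegreeTable, DegreeTable, DegreeTable) = {
        val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d")))
        val edges = sc.parallelize(Seq(
          Edge(1L, 2L, 1), Edge(1L, 3L, 1), Edge(2L, 3L, 1)))
        val graph = Graph(vertices, edges)

        // Each table only has entries for vertices with at least one
        // edge in the corresponding direction.
        (graph.degrees.collect().toMap,
         graph.inDegrees.collect().toMap,
         graph.outDegrees.collect().toMap)
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("degrees-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          val (total, in, out) = degreeTables(sc)
          println(s"degrees=$total inDegrees=$in outDegrees=$out")
        } finally sc.stop()
      }
    }
    ```

    When a zero entry is needed for every vertex, a common approach is graph.outerJoinVertices with getOrElse(0) on the optional degree.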

  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  15. def filter[VD2, ED2](preprocess: (Graph[VD, ED]) ⇒ Graph[VD2, ED2], epred: (EdgeTriplet[VD2, ED2]) ⇒ Boolean = (x: EdgeTriplet[VD2, ED2]) => true, vpred: (VertexId, VD2) ⇒ Boolean = (v: VertexId, d: VD2) => true)(implicit arg0: ClassTag[VD2], arg1: ClassTag[ED2]): Graph[VD, ED]

    Filter the graph by computing some values to filter on, and applying the predicates.

    VD2

    vertex type the vpred operates on

    ED2

    edge type the epred operates on

    preprocess

    a function to compute new vertex and edge data before filtering

    epred

    edge predicate to filter on after preprocess; see org.apache.spark.graphx.Graph#subgraph for details

    vpred

    vertex predicate to filter on after preprocess; see org.apache.spark.graphx.Graph#subgraph for details

    returns

    a subgraph of the original graph, with its data unchanged

    Example:
    1. This function can be used to filter the graph based on some property, without changing the vertex and edge values in your program. For example, we could remove the vertices whose out-degree is 0:

      graph.filter(
        graph => {
          val degrees: VertexRDD[Int] = graph.outDegrees
          graph.outerJoinVertices(degrees) {(vid, data, deg) => deg.getOrElse(0)}
        },
        vpred = (vid: VertexId, deg:Int) => deg > 0
      )
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  19. lazy val inDegrees: VertexRDD[Int]

    The in-degree of each vertex in the graph.

    Note

    Vertices with no in-edges are not returned in the resulting RDD.

  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. def joinVertices[U](table: RDD[(VertexId, U)])(mapFunc: (VertexId, VD, U) ⇒ VD)(implicit arg0: ClassTag[U]): Graph[VD, ED]

    Join the vertices with an RDD and then apply a function from the vertex and RDD entry to a new vertex value. The input table should contain at most one entry for each vertex. If no entry is provided, the map function is skipped and the old value is used.

    U

    the type of entry in the table of updates

    table

    the table to join with the vertices in the graph. The table should contain at most one entry for each vertex.

    mapFunc

    the function used to compute the new vertex values. The map function is invoked only for vertices with a corresponding entry in the table; otherwise the old vertex value is used.

    Example:
    1. This function is used to update the vertices with new values based on external data. For example, we could add the out-degree to each vertex record:

      // Load the graph and reset every vertex attribute to 0.
      val rawGraph: Graph[Int, Int] = GraphLoader.edgeListFile(sc, "webgraph")
        .mapVertices((_, _) => 0)
      val outDeg = rawGraph.outDegrees
      // Vertices without out-edges have no entry in outDeg and keep the value 0.
      val graph = rawGraph.joinVertices[Int](outDeg)(
        (_, _, outDeg) => outDeg)
  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. lazy val numEdges: Long

    The number of edges in the graph.

  26. lazy val numVertices: Long

    The number of vertices in the graph.

  27. lazy val outDegrees: VertexRDD[Int]

    The out-degree of each vertex in the graph.

    Note

    Vertices with no out-edges are not returned in the resulting RDD.

  28. def pageRank(tol: Double, resetProb: Double = 0.15): Graph[Double, Double]

    Run a dynamic version of PageRank, iterating until the ranks converge to within the tolerance tol, and return a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

    See also

    org.apache.spark.graphx.lib.PageRank$#runUntilConvergence
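
    A minimal sketch of a pageRank call follows, assuming a hypothetical star-shaped graph in which every edge points at vertex 1. The object name, the edge list, and the tolerance value are assumptions made for this example; exact rank values depend on the Spark version, so the sketch only looks at which vertex ranks highest.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object PageRankSketch {
      // Hypothetical toy graph: vertices 2, 3 and 4 all link to vertex 1.
      def topRanked(sc: SparkContext): VertexId = {
        val edges = sc.parallelize(Seq(
          Edge(2L, 1L, 1), Edge(3L, 1L, 1), Edge(4L, 1L, 1)))
        val graph = Graph.fromEdges(edges, defaultValue = 0)

        // Run until convergence within the given tolerance,
        // keeping the default resetProb of 0.15.
        val ranks = graph.pageRank(tol = 0.0001).vertices

        // Pick the id of the vertex with the largest rank.
        ranks.collect().maxBy(_._2)._1
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("pagerank-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          println(s"top-ranked vertex: ${topRanked(sc)}")
        } finally sc.stop()
      }
    }
    ```

    staticPageRank takes a fixed iteration count in place of tol and can be substituted when a bounded running time matters more than convergence.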

  29. def personalizedPageRank(src: VertexId, tol: Double, resetProb: Double = 0.15): Graph[Double, Double]

    Run personalized PageRank for a given vertex, such that all random walks are started relative to the source node.

    See also

    org.apache.spark.graphx.lib.PageRank$#runUntilConvergenceWithOptions

  30. def pickRandomVertex(): VertexId

    Picks a random vertex from the graph and returns its ID.

  31. def pregel[A](initialMsg: A, maxIterations: Int = Int.MaxValue, activeDirection: EdgeDirection = EdgeDirection.Either)(vprog: (VertexId, VD, A) ⇒ VD, sendMsg: (EdgeTriplet[VD, ED]) ⇒ Iterator[(VertexId, A)], mergeMsg: (A, A) ⇒ A)(implicit arg0: ClassTag[A]): Graph[VD, ED]

    Execute a Pregel-like iterative vertex-parallel abstraction. The user-defined vertex-program vprog is executed in parallel on each vertex receiving any inbound messages and computing a new value for the vertex. The sendMsg function is then invoked on all out-edges and is used to compute an optional message to the destination vertex. The mergeMsg function is a commutative associative function used to combine messages destined to the same vertex.

    On the first iteration all vertices receive the initialMsg and on subsequent iterations if a vertex does not receive a message then the vertex-program is not invoked.

    This function iterates until there are no remaining messages, or for maxIterations iterations.

    A

    the Pregel message type

    initialMsg

    the message each vertex will receive on the first iteration

    maxIterations

    the maximum number of iterations to run for

    activeDirection

    the direction of edges incident to a vertex that received a message in the previous round on which to run sendMsg. For example, if this is EdgeDirection.Out, only out-edges of vertices that received a message in the previous round will run.

    vprog

    the user-defined vertex program which runs on each vertex and receives the inbound message and computes a new vertex value. On the first iteration the vertex program is invoked on all vertices and is passed initialMsg. On subsequent iterations the vertex program is only invoked on those vertices that receive messages.

    sendMsg

    a user supplied function that is applied to out edges of vertices that received messages in the current iteration

    mergeMsg

    a user supplied function that takes two incoming messages of type A and merges them into a single message of type A. This function must be commutative and associative and ideally the size of A should not increase.

    returns

    the resulting graph at the end of the computation
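
    To show how vprog, sendMsg, and mergeMsg fit together, here is a minimal sketch of single-source shortest paths written with pregel. The object name, the weighted edge list, and the choice of vertex 1 as the source are assumptions made for this example.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object PregelShortestPathsSketch {
      // Hypothetical weighted graph: 1 -> 2 (1.0), 2 -> 3 (2.0), 1 -> 3 (5.0), 3 -> 4 (1.0).
      def distancesFrom(sc: SparkContext, sourceId: VertexId): Map[VertexId, Double] = {
        val edges = sc.parallelize(Seq(
          Edge(1L, 2L, 1.0), Edge(2L, 3L, 2.0), Edge(1L, 3L, 5.0), Edge(3L, 4L, 1.0)))
        val graph = Graph.fromEdges(edges, defaultValue = 0.0)

        // The source starts at distance 0, every other vertex at infinity.
        val initial = graph.mapVertices((id, _) =>
          if (id == sourceId) 0.0 else Double.PositiveInfinity)

        val result = initial.pregel(Double.PositiveInfinity)(
          // vprog: keep the smaller of the current and the received distance.
          (_, dist, newDist) => math.min(dist, newDist),
          // sendMsg: offer a shorter path to the destination if one exists.
          triplet =>
            if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
              Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
            else Iterator.empty,
          // mergeMsg: of several offers to one vertex, keep the shortest.
          (a, b) => math.min(a, b))

        result.vertices.collect().toMap
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("pregel-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          distancesFrom(sc, 1L).toSeq.sorted.foreach(println)
        } finally sc.stop()
      }
    }
    ```

    The initial message is positive infinity so that the first vprog call leaves every starting distance unchanged. The loop stops once no vertex receives a shorter offer, which is the "no remaining messages" condition described above.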

  32. def removeSelfEdges(): Graph[VD, ED]

    Remove self edges.

    returns

    a graph with all self edges removed

  33. def staticPageRank(numIter: Int, resetProb: Double = 0.15): Graph[Double, Double]

    Run PageRank for a fixed number of iterations, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

    See also

    org.apache.spark.graphx.lib.PageRank$#run

  34. def staticParallelPersonalizedPageRank(sources: Array[VertexId], numIter: Int, resetProb: Double = 0.15): Graph[Vector, Double]

    Run parallel personalized PageRank for a given array of source vertices, such that all random walks are started relative to the source vertices.

  35. def staticPersonalizedPageRank(src: VertexId, numIter: Int, resetProb: Double = 0.15): Graph[Double, Double]

    Run Personalized PageRank for a fixed number of iterations, with all iterations originating at the source node, returning a graph with vertex attributes containing the PageRank and edge attributes containing the normalized edge weight.

    See also

    org.apache.spark.graphx.lib.PageRank$#runWithOptions

  36. def stronglyConnectedComponents(numIter: Int): Graph[VertexId, ED]

    Compute the strongly connected component (SCC) of each vertex and return a graph with the vertex value containing the lowest vertex id in the SCC containing that vertex.

    See also

    org.apache.spark.graphx.lib.StronglyConnectedComponents$#run

  37. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  38. def toString(): String

    Definition Classes
    AnyRef → Any
  39. def triangleCount(): Graph[Int, ED]

    Compute the number of triangles passing through each vertex.

    See also

    org.apache.spark.graphx.lib.TriangleCount$#run
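
    A minimal sketch of triangleCount on a hypothetical four-vertex graph follows. The object name and edge list are assumptions made for this example; the edges are written with srcId < dstId so that the input is already in canonical orientation.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}

    object TriangleCountSketch {
      // Hypothetical toy graph: a triangle on {1, 2, 3} plus a pendant edge 3 -> 4.
      def trianglesPerVertex(sc: SparkContext): Map[VertexId, Int] = {
        val edges = sc.parallelize(Seq(
          Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(1L, 3L, 1), Edge(3L, 4L, 1)))
        val graph = Graph.fromEdges(edges, defaultValue = 0)

        // Each vertex attribute becomes the number of triangles through it.
        graph.triangleCount().vertices.collect().toMap
      }

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("triangle-count-sketch").setMaster("local[2]")
        val sc = new SparkContext(conf)
        try {
          trianglesPerVertex(sc).toSeq.sorted.foreach(println)
        } finally sc.stop()
      }
    }
    ```

    Summing the per-vertex counts and dividing by three gives the number of distinct triangles, since each triangle is counted once at each of its corners.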

  40. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
