org.apache.spark.graphx

VertexRDD

class VertexRDD[VD] extends RDD[(VertexId, VD)]

Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for efficient joins. Two VertexRDDs with the same index can be joined efficiently. All operations that return a VertexRDD, except reindex, preserve the index. To construct a VertexRDD, use the VertexRDD companion object.

VD

the type of the vertex attribute associated with each vertex in the set.

Example:
  1. Construct a VertexRDD from a plain RDD:

    import org.apache.spark.graphx._
    import org.apache.spark.rdd.RDD

    // SomeType, OtherType, loadData and reduceFunc are placeholders
    // Construct an initial vertex set
    val someData: RDD[(VertexId, SomeType)] = loadData(someFile)
    val vset = VertexRDD(someData)
    // If someData contained duplicate vertex ids, merge their values with reduceFunc
    val vset2 = VertexRDD(someData, reduceFunc)
    // Use the VertexRDD to index another dataset
    val otherData: RDD[(VertexId, OtherType)] = loadData(otherFile)
    val vset3 = vset2.innerJoin(otherData) { (vid, a, b) => b }
    // vset3 shares vset2's index, so joins between the two are fast.
    // leftJoin needs a join function; it receives None where vset3 has no entry.
    val vset4: VertexRDD[(SomeType, Option[OtherType])] =
      vset2.leftJoin(vset3) { (vid, a, b) => (a, b) }
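
To illustrate why a shared index makes joins cheap, here is a minimal plain-Scala sketch. It does not use Spark; the `Indexed` case class and `zipJoin` helper are hypothetical stand-ins for an indexed vertex partition, not the GraphX implementation.

```scala
object CoIndexedJoinSketch {
  type VertexId = Long

  // Hypothetical stand-in for one indexed vertex partition:
  // values(i) is the attribute of vertex index(i).
  final case class Indexed[VD](index: Vector[VertexId], values: Vector[VD])

  // When both sides share the very same index object, a join is a
  // positional zip of the value vectors; no per-vertex lookup is needed.
  def zipJoin[A, B, C](left: Indexed[A], right: Indexed[B])(f: (VertexId, A, B) => C): Indexed[C] = {
    require(left.index eq right.index, "both sides must share the same index")
    val joined = left.index.indices.map(i => f(left.index(i), left.values(i), right.values(i))).toVector
    Indexed(left.index, joined)
  }

  val index: Vector[VertexId] = Vector(1L, 2L, 3L)
  val names = Indexed(index, Vector("a", "b", "c"))
  val ranks = Indexed(index, Vector(0.5, 0.3, 0.2))
  val joined: Indexed[(String, Double)] = zipJoin(names, ranks)((_, n, r) => (n, r))
}
```

The result reuses the input index, which is the property that keeps further joins against `names` or `ranks` positional.
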
Linear Supertypes
RDD[(VertexId, VD)], Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new VertexRDD(partitionsRDD: RDD[VertexPartition[VD]])(implicit arg0: ClassTag[VD])

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. def ++(other: RDD[(VertexId, VD)]): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  5. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  6. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  7. def aggregate[U](zeroValue: U)(seqOp: (U, (VertexId, VD)) ⇒ U, combOp: (U, U) ⇒ U)(implicit arg0: ClassTag[U]): U

    Definition Classes
    RDD
  8. def aggregateUsingIndex[VD2](messages: RDD[(VertexId, VD2)], reduceFunc: (VD2, VD2) ⇒ VD2)(implicit arg0: ClassTag[VD2]): VertexRDD[VD2]

    Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.

    messages

    an RDD containing messages to aggregate, where each message is a pair of its target vertex ID and the message data

    reduceFunc

    the associative aggregation function for merging messages to the same vertex

    returns

    a VertexRDD co-indexed with this, containing only vertices that received messages. For those vertices, their values are the result of applying reduceFunc to all received messages.
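
The aggregation described above can be modelled with standard Scala collections. This is only a sketch of the semantics: the `indexed` set stands in for the vertex ids known to this VertexRDD, and dropping messages addressed to unknown ids is an assumption of the model.

```scala
object AggregateUsingIndexSketch {
  type VertexId = Long

  // Plain-collections model, not the GraphX implementation.
  def aggregateUsingIndex[VD2](indexed: Set[VertexId], messages: Seq[(VertexId, VD2)])(
      reduceFunc: (VD2, VD2) => VD2): Map[VertexId, VD2] =
    messages
      .filter { case (vid, _) => indexed.contains(vid) } // assumption: unknown ids are dropped
      .groupBy(_._1)
      .map { case (vid, msgs) => vid -> msgs.map(_._2).reduce(reduceFunc) }

  // Vertex 1 receives two messages, vertex 3 none, and id 9 is not indexed.
  val result: Map[VertexId, Int] =
    aggregateUsingIndex(Set(1L, 2L, 3L), Seq((1L, 4), (2L, 1), (1L, 6), (9L, 7)))(_ + _)
}
```

Vertex 3 is absent from `result` because it received no messages, matching the "only vertices that received messages" wording.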

  9. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  10. def cache(): VertexRDD[VD]

    Persist this RDD with the default storage level (MEMORY_ONLY).

    Definition Classes
    VertexRDD → RDD
  11. def cartesian[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[((VertexId, VD), U)]

    Definition Classes
    RDD
  12. def checkpoint(): Unit

    Definition Classes
    RDD
  13. def clearDependencies(): Unit

    Attributes
    protected
    Definition Classes
    RDD
  14. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  15. def coalesce(numPartitions: Int, shuffle: Boolean): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  16. def collect[U](f: PartialFunction[(VertexId, VD), U])(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  17. def collect(): Array[(VertexId, VD)]

    Definition Classes
    RDD
  18. def compute(part: Partition, context: TaskContext): Iterator[(VertexId, VD)]

    Provides the RDD[(VertexId, VD)] equivalent output.

    Definition Classes
    VertexRDD → RDD
  19. def context: SparkContext

    Definition Classes
    RDD
  20. def count(): Long

    The number of vertices in the RDD.

    Definition Classes
    VertexRDD → RDD
  21. def countApprox(timeout: Long, confidence: Double): PartialResult[BoundedDouble]

    Definition Classes
    RDD
  22. def countApproxDistinct(relativeSD: Double): Long

    Definition Classes
    RDD
  23. def countByValue(): Map[(VertexId, VD), Long]

    Definition Classes
    RDD
  24. def countByValueApprox(timeout: Long, confidence: Double): PartialResult[Map[(VertexId, VD), BoundedDouble]]

    Definition Classes
    RDD
  25. final def dependencies: Seq[Dependency[_]]

    Definition Classes
    RDD
  26. def diff(other: VertexRDD[VD]): VertexRDD[VD]

    Hides vertices that are the same between this and other; for vertices that are different, keeps the values from other.
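
As a rough illustration of the behaviour described for diff, the following sketch models both vertex sets as plain Maps. It assumes both maps cover the same vertex ids (standing in for a shared index) and is not the GraphX implementation.

```scala
object DiffSketch {
  type VertexId = Long

  // Keep only the entries of `other` whose value differs from the one in `self`.
  def diff[VD](self: Map[VertexId, VD], other: Map[VertexId, VD]): Map[VertexId, VD] =
    other.filter { case (vid, value) => !self.get(vid).contains(value) }

  val before: Map[VertexId, String] = Map(1L -> "a", 2L -> "b", 3L -> "c")
  val after: Map[VertexId, String] = Map(1L -> "a", 2L -> "B", 3L -> "c")
  // Only vertex 2 changed, and the value kept is the one from `after`.
  val changed: Map[VertexId, String] = diff(before, after)
}
```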

  27. def distinct(): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  28. def distinct(numPartitions: Int): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  29. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  30. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  31. def filter(pred: ((VertexId, VD)) ⇒ Boolean): VertexRDD[VD]

    Restricts the vertex set to the set of vertices satisfying the given predicate. This operation preserves the index for efficient joins with the original RDD, and it sets bits in the bitmask rather than allocating new memory.

    pred

    the user defined predicate, which takes a tuple to conform to the RDD[(VertexId, VD)] interface

    Definition Classes
    VertexRDD → RDD
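
The bitmask idea mentioned for filter can be sketched in plain Scala. The `Indexed` layout below (an index vector, a value vector, and a visibility mask) is a hypothetical model chosen for illustration; the real partition layout may differ.

```scala
import scala.collection.immutable.BitSet

object FilterSketch {
  type VertexId = Long

  // Hypothetical partition layout: values(i) belongs to vertex index(i),
  // and mask holds the positions that are currently visible.
  final case class Indexed[VD](index: Vector[VertexId], values: Vector[VD], mask: BitSet) {
    // Only the mask changes; index and values are shared with the original.
    def filter(pred: ((VertexId, VD)) => Boolean): Indexed[VD] =
      copy(mask = mask.filter(i => pred((index(i), values(i)))))

    // The entries an RDD view of this partition would expose.
    def visible: Vector[(VertexId, VD)] = mask.toVector.map(i => (index(i), values(i)))
  }

  val all = Indexed(Vector(1L, 2L, 3L), Vector(10, 25, 40), BitSet(0, 1, 2))
  val large = all.filter { case (_, value) => value > 20 }
}
```

Because `large` still refers to the same index as `all`, the two remain co-indexed, which is what keeps later joins with the original cheap.
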
  32. def filterWith[A](constructA: (Int) ⇒ A)(p: ((VertexId, VD), A) ⇒ Boolean)(implicit arg0: ClassTag[A]): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  33. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  34. def first(): (VertexId, VD)

    Definition Classes
    RDD
  35. def firstParent[U](implicit arg0: ClassTag[U]): RDD[U]

    Attributes
    protected[org.apache.spark]
    Definition Classes
    RDD
  36. def flatMap[U](f: ((VertexId, VD)) ⇒ TraversableOnce[U])(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  37. def flatMapWith[A, U](constructA: (Int) ⇒ A, preservesPartitioning: Boolean)(f: ((VertexId, VD), A) ⇒ Seq[U])(implicit arg0: ClassTag[A], arg1: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  38. def fold(zeroValue: (VertexId, VD))(op: ((VertexId, VD), (VertexId, VD)) ⇒ (VertexId, VD)): (VertexId, VD)

    Definition Classes
    RDD
  39. def foreach(f: ((VertexId, VD)) ⇒ Unit): Unit

    Definition Classes
    RDD
  40. def foreachPartition(f: (Iterator[(VertexId, VD)]) ⇒ Unit): Unit

    Definition Classes
    RDD
  41. def foreachWith[A](constructA: (Int) ⇒ A)(f: ((VertexId, VD), A) ⇒ Unit)(implicit arg0: ClassTag[A]): Unit

    Definition Classes
    RDD
  42. var generator: String

    Definition Classes
    RDD
  43. def getCheckpointFile: Option[String]

    Definition Classes
    RDD
  44. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  45. def getDependencies: Seq[Dependency[_]]

    Attributes
    protected
    Definition Classes
    RDD
  46. def getPartitions: Array[Partition]

    Attributes
    protected
    Definition Classes
    VertexRDD → RDD
  47. def getPreferredLocations(s: Partition): Seq[String]

    Attributes
    protected
    Definition Classes
    VertexRDD → RDD
  48. def getStorageLevel: StorageLevel

    Definition Classes
    RDD
  49. def glom(): RDD[Array[(VertexId, VD)]]

    Definition Classes
    RDD
  50. def groupBy[K](f: ((VertexId, VD)) ⇒ K, p: Partitioner)(implicit arg0: ClassTag[K]): RDD[(K, Seq[(VertexId, VD)])]

    Definition Classes
    RDD
  51. def groupBy[K](f: ((VertexId, VD)) ⇒ K, numPartitions: Int)(implicit arg0: ClassTag[K]): RDD[(K, Seq[(VertexId, VD)])]

    Definition Classes
    RDD
  52. def groupBy[K](f: ((VertexId, VD)) ⇒ K)(implicit arg0: ClassTag[K]): RDD[(K, Seq[(VertexId, VD)])]

    Definition Classes
    RDD
  53. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  54. val id: Int

    Definition Classes
    RDD
  55. def innerJoin[U, VD2](other: RDD[(VertexId, U)])(f: (VertexId, VD, U) ⇒ VD2)(implicit arg0: ClassTag[U], arg1: ClassTag[VD2]): VertexRDD[VD2]

    Inner joins this VertexRDD with an RDD containing vertex attribute pairs. If the other RDD is backed by a VertexRDD with the same index then the efficient innerZipJoin implementation is used.

    other

    an RDD containing vertices to join. If there are multiple entries for the same vertex, one is picked arbitrarily. Use aggregateUsingIndex to merge multiple entries.

    f

    the join function applied to corresponding values of this and other

    returns

    a VertexRDD co-indexed with this, containing only vertices that appear in both this and other, with values supplied by f
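
The inner-join semantics above can be modelled over plain Maps. This is a sketch of which vertices survive and how f is applied, under the assumption that a Map is an adequate stand-in for a vertex set; it is not the GraphX implementation.

```scala
object InnerJoinSketch {
  type VertexId = Long

  // Keep only vertices present on both sides and combine their values with f.
  def innerJoin[VD, U, VD2](self: Map[VertexId, VD], other: Map[VertexId, U])(
      f: (VertexId, VD, U) => VD2): Map[VertexId, VD2] =
    self.collect { case (vid, a) if other.contains(vid) => vid -> f(vid, a, other(vid)) }

  val degrees: Map[VertexId, Int] = Map(1L -> 2, 2L -> 5, 3L -> 1)
  val labels: Map[VertexId, String] = Map(2L -> "hub", 4L -> "orphan")
  // Only vertex 2 appears in both inputs.
  val joined: Map[VertexId, String] = innerJoin(degrees, labels)((_, d, l) => s"$l:$d")
}
```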

  56. def innerZipJoin[U, VD2](other: VertexRDD[U])(f: (VertexId, VD, U) ⇒ VD2)(implicit arg0: ClassTag[U], arg1: ClassTag[VD2]): VertexRDD[VD2]

    Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index. See innerJoin for the behavior of the join.

  57. def isCheckpointed: Boolean

    Definition Classes
    RDD
  58. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  59. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  60. final def iterator(split: Partition, context: TaskContext): Iterator[(VertexId, VD)]

    Definition Classes
    RDD
  61. def keyBy[K](f: ((VertexId, VD)) ⇒ K): RDD[(K, (VertexId, VD))]

    Definition Classes
    RDD
  62. def leftJoin[VD2, VD3](other: RDD[(VertexId, VD2)])(f: (VertexId, VD, Option[VD2]) ⇒ VD3)(implicit arg0: ClassTag[VD2], arg1: ClassTag[VD3]): VertexRDD[VD3]

    Left joins this VertexRDD with an RDD containing vertex attribute pairs. If the other RDD is backed by a VertexRDD with the same index then the efficient leftZipJoin implementation is used. The resulting VertexRDD contains an entry for each vertex in this. If other is missing any vertex in this VertexRDD, f is passed None. If other contains duplicate entries for a vertex, one is picked arbitrarily.

    VD2

    the attribute type of the other RDD

    VD3

    the attribute type of the resulting VertexRDD

    other

    the RDD of vertex attribute pairs with which to join

    f

    the function mapping a vertex id and its attributes in this and the other vertex set to a new vertex attribute.

    returns

    a VertexRDD containing all the vertices in this VertexRDD with the attributes emitted by f.
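
A plain-Map model may help show how f sees missing vertices in a left join. The helper below is a sketch of the semantics only; the Map-based representation is an assumption made for illustration.

```scala
object LeftJoinSketch {
  type VertexId = Long

  // Every vertex of `self` is kept; f receives None when `other` has no entry.
  def leftJoin[VD, VD2, VD3](self: Map[VertexId, VD], other: Map[VertexId, VD2])(
      f: (VertexId, VD, Option[VD2]) => VD3): Map[VertexId, VD3] =
    self.map { case (vid, a) => vid -> f(vid, a, other.get(vid)) }

  val names: Map[VertexId, String] = Map(1L -> "a", 2L -> "b")
  val scores: Map[VertexId, Double] = Map(2L -> 0.75)
  // Vertex 1 has no score, so f falls back to a default of 0.0.
  val joined: Map[VertexId, (String, Double)] =
    leftJoin(names, scores)((_, name, score) => (name, score.getOrElse(0.0)))
}
```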

  63. def leftZipJoin[VD2, VD3](other: VertexRDD[VD2])(f: (VertexId, VD, Option[VD2]) ⇒ VD3)(implicit arg0: ClassTag[VD2], arg1: ClassTag[VD3]): VertexRDD[VD3]

    Left joins this VertexRDD with another VertexRDD with the same index. This function will fail if the two VertexRDDs do not share the same index. The resulting vertex set contains an entry for each vertex in this. If other is missing any vertex in this VertexRDD, f is passed None.

    VD2

    the attribute type of the other VertexRDD

    VD3

    the attribute type of the resulting VertexRDD

    other

    the other VertexRDD with which to join.

    f

    the function mapping a vertex id and its attributes in this and the other vertex set to a new vertex attribute.

    returns

    a VertexRDD containing the results of f

  64. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  65. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  66. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  67. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  68. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  69. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def map[U](f: ((VertexId, VD)) ⇒ U)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  76. def mapPartitions[U](f: (Iterator[(VertexId, VD)]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  77. def mapPartitionsWithContext[U](f: (TaskContext, Iterator[(VertexId, VD)]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  78. def mapPartitionsWithIndex[U](f: (Int, Iterator[(VertexId, VD)]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  79. def mapValues[VD2](f: (VertexId, VD) ⇒ VD2)(implicit arg0: ClassTag[VD2]): VertexRDD[VD2]

    Maps each vertex attribute, additionally supplying the vertex ID.

    VD2

    the type returned by the map function

    f

    the function applied to each ID-value pair in the RDD

    returns

    a new VertexRDD with values obtained by applying f to each of the entries in the original VertexRDD. The resulting VertexRDD retains the same index.

  80. def mapValues[VD2](f: (VD) ⇒ VD2)(implicit arg0: ClassTag[VD2]): VertexRDD[VD2]

    Maps each vertex attribute, preserving the index.

    VD2

    the type returned by the map function

    f

    the function applied to each value in the RDD

    returns

    a new VertexRDD with values obtained by applying f to each of the entries in the original VertexRDD
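
The two mapValues overloads can be sketched on a hypothetical `Indexed` layout to show that only the values change while the index is reused. This is a plain-Scala model under that assumed layout, not the GraphX implementation.

```scala
object MapValuesSketch {
  type VertexId = Long

  // Hypothetical layout: values(i) is the attribute of vertex index(i).
  final case class Indexed[VD](index: Vector[VertexId], values: Vector[VD]) {
    // Value-only overload: the index is reused untouched.
    def mapValues[VD2](f: VD => VD2): Indexed[VD2] = Indexed(index, values.map(f))

    // Overload that also supplies the vertex id.
    def mapValues[VD2](f: (VertexId, VD) => VD2): Indexed[VD2] =
      Indexed(index, index.zip(values).map { case (vid, v) => f(vid, v) })
  }

  val ranks = Indexed(Vector(1L, 2L), Vector(0.25, 0.75))
  val doubled = ranks.mapValues((r: Double) => r * 2)
  val tagged = ranks.mapValues((vid: VertexId, r: Double) => s"$vid:$r")
}
```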

  81. def mapWith[A, U](constructA: (Int) ⇒ A, preservesPartitioning: Boolean)(f: ((VertexId, VD), A) ⇒ U)(implicit arg0: ClassTag[A], arg1: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
  82. var name: String

    Definition Classes
    RDD
  83. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  84. final def notify(): Unit

    Definition Classes
    AnyRef
  85. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  86. val partitioner: Option[Partitioner]

    Definition Classes
    VertexRDD → RDD
  87. final def partitions: Array[Partition]

    Definition Classes
    RDD
  88. val partitionsRDD: RDD[VertexPartition[VD]]

  89. def persist(): VertexRDD[VD]

    Persist this RDD with the default storage level (MEMORY_ONLY).

    Definition Classes
    VertexRDD → RDD
  90. def persist(newLevel: StorageLevel): VertexRDD[VD]

    Definition Classes
    VertexRDD → RDD
  91. def pipe(command: Seq[String], env: Map[String, String], printPipeContext: ((String) ⇒ Unit) ⇒ Unit, printRDDElement: ((VertexId, VD), (String) ⇒ Unit) ⇒ Unit): RDD[String]

    Definition Classes
    RDD
  92. def pipe(command: String, env: Map[String, String]): RDD[String]

    Definition Classes
    RDD
  93. def pipe(command: String): RDD[String]

    Definition Classes
    RDD
  94. final def preferredLocations(split: Partition): Seq[String]

    Definition Classes
    RDD
  95. def reduce(f: ((VertexId, VD), (VertexId, VD)) ⇒ (VertexId, VD)): (VertexId, VD)

    Definition Classes
    RDD
  96. def reindex(): VertexRDD[VD]

    Construct a new VertexRDD that is indexed by only the visible vertices. The resulting VertexRDD will be based on a different index and can no longer be quickly joined with this RDD.
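
To make the effect of reindex concrete, here is a plain-Scala sketch using a hypothetical `Indexed` layout with a visibility mask. It assumes hidden vertices still occupy slots in the old index, which is a modelling choice rather than a statement about GraphX internals.

```scala
import scala.collection.immutable.BitSet

object ReindexSketch {
  type VertexId = Long

  // Hypothetical layout: values(i) belongs to vertex index(i); mask marks visible positions.
  final case class Indexed[VD](index: Vector[VertexId], values: Vector[VD], mask: BitSet) {
    // Build a fresh, compact index holding only the visible vertices.
    def reindex: Indexed[VD] = {
      val keep = mask.toVector
      Indexed(keep.map(i => index(i)), keep.map(i => values(i)), BitSet(keep.indices: _*))
    }
  }

  // Vertex 2 (position 1) was hidden by an earlier filter.
  val filtered = Indexed(Vector(1L, 2L, 3L), Vector("a", "b", "c"), BitSet(0, 2))
  val compact = filtered.reindex
}
```

The compact result no longer shares an index object with `filtered`, so a positional join between the two is not possible in this model.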

  97. def repartition(numPartitions: Int): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  98. def sample(withReplacement: Boolean, fraction: Double, seed: Int): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  99. def saveAsObjectFile(path: String): Unit

    Definition Classes
    RDD
  100. def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit

    Definition Classes
    RDD
  101. def saveAsTextFile(path: String): Unit

    Definition Classes
    RDD
  102. def setGenerator(_generator: String): Unit

    Definition Classes
    RDD
  103. def setName(_name: String): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  104. def sparkContext: SparkContext

    Definition Classes
    RDD
  105. def subtract(other: RDD[(VertexId, VD)], p: Partitioner): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  106. def subtract(other: RDD[(VertexId, VD)], numPartitions: Int): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  107. def subtract(other: RDD[(VertexId, VD)]): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  108. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  109. def take(num: Int): Array[(VertexId, VD)]

    Definition Classes
    RDD
  110. def takeOrdered(num: Int)(implicit ord: Ordering[(VertexId, VD)]): Array[(VertexId, VD)]

    Definition Classes
    RDD
  111. def takeSample(withReplacement: Boolean, num: Int, seed: Int): Array[(VertexId, VD)]

    Definition Classes
    RDD
  112. def toArray(): Array[(VertexId, VD)]

    Definition Classes
    RDD
  113. def toDebugString: String

    Definition Classes
    RDD
  114. def toJavaRDD(): JavaRDD[(VertexId, VD)]

    Definition Classes
    RDD
  115. def toString(): String

    Definition Classes
    RDD → AnyRef → Any
  116. def top(num: Int)(implicit ord: Ordering[(VertexId, VD)]): Array[(VertexId, VD)]

    Definition Classes
    RDD
  117. def union(other: RDD[(VertexId, VD)]): RDD[(VertexId, VD)]

    Definition Classes
    RDD
  118. def unpersist(blocking: Boolean = true): VertexRDD[VD]

    Definition Classes
    VertexRDD → RDD
  119. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  120. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  121. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  122. def zip[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[((VertexId, VD), U)]

    Definition Classes
    RDD
  123. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D])(f: (Iterator[(VertexId, VD)], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  124. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D], preservesPartitioning: Boolean)(f: (Iterator[(VertexId, VD)], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  125. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C])(f: (Iterator[(VertexId, VD)], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  126. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C], preservesPartitioning: Boolean)(f: (Iterator[(VertexId, VD)], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  127. def zipPartitions[B, V](rdd2: RDD[B])(f: (Iterator[(VertexId, VD)], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]

    Definition Classes
    RDD
  128. def zipPartitions[B, V](rdd2: RDD[B], preservesPartitioning: Boolean)(f: (Iterator[(VertexId, VD)], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]

    Definition Classes
    RDD

Deprecated Value Members

  1. def mapPartitionsWithSplit[U](f: (Int, Iterator[(VertexId, VD)]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]

    Definition Classes
    RDD
    Annotations
    @deprecated
    Deprecated

    (Since version 0.7.0) use mapPartitionsWithIndex
