org.apache.spark.graphx.PartitionStrategy

EdgePartition2D

object EdgePartition2D extends PartitionStrategy with Product with Serializable

Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.

Suppose we have a graph with 11 vertices that we want to partition over 9 machines. We can use the following sparse matrix representation:

      __________________________________
 v0   | P0 *     | P1       | P2    *  |
 v1   |  ****    |  *       |          |
 v2   |  ******* |      **  |  ****    |
 v3   |  *****   |  *  *    |       *  |
      ----------------------------------
 v4   | P3 *     | P4 ***   | P5 **  * |
 v5   |  *  *    |  *       |          |
 v6   |       *  |      **  |  ****    |
 v7   |  * * *   |  *  *    |       *  |
      ----------------------------------
 v8   | P6   *   | P7    *  | P8  *   *|
 v9   |     *    |  *    *  |          |
 v10  |       *  |      **  |  *  *    |
 v11  | * <-E    |  ***     |       ** |
      ----------------------------------

The edge denoted by E connects v11 with v1 and is assigned to processor P6. To get the processor number we divide the matrix into sqrt(numParts) by sqrt(numParts) blocks. Notice that edges adjacent to v11 can only be in the first column of blocks (P0, P3, P6) or the last row of blocks (P6, P7, P8). As a consequence we can guarantee that v11 will need to be replicated to at most 2 * sqrt(numParts) machines.

Notice that P0 has many edges and as a consequence this partitioning would lead to poor work balance. To improve balance we first multiply each vertex id by a large prime to shuffle the vertex locations.

One of the limitations of this approach is that the number of machines must either be a perfect square. We partially address this limitation by computing the machine assignment to the next largest perfect square and then mapping back down to the actual number of machines. Unfortunately, this can also lead to work imbalance and so it is suggested that a perfect square is used.

Linear Supertypes
Product, Equals, PartitionStrategy, Serializable, Serializable, AnyRef, Any
Ordering
  1. Alphabetic
  2. By inheritance
Inherited
  1. EdgePartition2D
  2. Product
  3. Equals
  4. PartitionStrategy
  5. Serializable
  6. Serializable
  7. AnyRef
  8. Any
  1. Hide All
  2. Show all
Learn more about member selection
Visibility
  1. Public
  2. All

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def getPartition(src: VertexId, dst: VertexId, numParts: PartitionID): PartitionID

    Returns the partition number for a given edge.

    Returns the partition number for a given edge.

    Definition Classes
    EdgePartition2DPartitionStrategy
  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  17. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  18. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from Product

Inherited from Equals

Inherited from PartitionStrategy

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any

Ungrouped