Packages

object EdgePartition2D extends PartitionStrategy with Product with Serializable

Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.

Suppose we have a graph with 12 vertices that we want to partition over 9 machines. We can use the following sparse matrix representation:

      __________________________________
 v0   | P0 *     | P1       | P2    *  |
 v1   |  ****    |  *       |          |
 v2   |  ******* |      **  |  ****    |
 v3   |  *****   |  *  *    |       *  |
      ----------------------------------
 v4   | P3 *     | P4 ***   | P5 **  * |
 v5   |  *  *    |  *       |          |
 v6   |       *  |      **  |  ****    |
 v7   |  * * *   |  *  *    |       *  |
      ----------------------------------
 v8   | P6   *   | P7    *  | P8  *   *|
 v9   |     *    |  *    *  |          |
 v10  |       *  |      **  |  *  *    |
 v11  | * <-E    |  ***     |       ** |
      ----------------------------------

The edge denoted by E connects v11 with v1 and is assigned to processor P6. To get the processor number we divide the matrix into sqrt(numParts) by sqrt(numParts) blocks. Notice that edges adjacent to v11 can only be in the first column of blocks (P0, P3, P6) or the last row of blocks (P6, P7, P8). As a consequence we can guarantee that v11 will need to be replicated to at most 2 * sqrt(numParts) machines.

Notice that P0 has many edges and as a consequence this partitioning would lead to poor work balance. To improve balance we first multiply each vertex id by a large prime to shuffle the vertex locations.

When the number of partitions requested is not a perfect square we use a slightly different method where the last column can have a different number of rows than the others while still maintaining the same size per block.

Source
PartitionStrategy.scala
Linear Supertypes
Product, Equals, PartitionStrategy, Serializable, Serializable, AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. EdgePartition2D
  2. Product
  3. Equals
  4. PartitionStrategy
  5. Serializable
  6. Serializable
  7. AnyRef
  8. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Value Members

  1. def getPartition(src: VertexId, dst: VertexId, numParts: PartitionID): PartitionID

    Returns the partition number for a given edge.

    Returns the partition number for a given edge.

    Definition Classes
    EdgePartition2DPartitionStrategy