org.apache.spark.graphx.PartitionStrategy.EdgePartition2D$

All Implemented Interfaces: Serializable, PartitionStrategy, scala.Equals, scala.Product

Enclosing interface: PartitionStrategy

public static class PartitionStrategy.EdgePartition2D$ extends Object implements PartitionStrategy, scala.Product, Serializable

Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication.

Suppose we have a graph with 12 vertices that we want to partition over 9 machines. We can use the following sparse matrix representation:

       __________________________________
  v0   | P0 *     | P1       | P2    *  |
  v1   |  ****    |  *       |          |
  v2   |  ******* |      **  |  ****    |
  v3   |  *****   |  *  *    |       *  |
       ----------------------------------
  v4   | P3 *     | P4 ***   | P5 **  * |
  v5   |  *  *    |  *       |          |
  v6   |       *  |      **  |  ****    |
  v7   |  * * *   |  *  *    |       *  |
       ----------------------------------
  v8   | P6   *   | P7    *  | P8  *   *|
  v9   |     *    |  *    *  |          |
  v10  |       *  |      **  |  *  *    |
  v11  | * <-E    |  ***     |       ** |
       ----------------------------------

The edge denoted by E connects v11 with v1 and is assigned to processor P6. To get the processor number we divide the matrix into sqrt(numParts) by sqrt(numParts) blocks. Notice that edges adjacent to v11 can only be in the last column of blocks (P2, P5, P8) or the last row of blocks (P6, P7, P8). As a consequence we can guarantee that v11 will need to be replicated to at most 2 * sqrt(numParts) machines.
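The block lookup in the figure can be sketched in a few lines. This is a minimal illustration of the contiguous 3 x 3 layout drawn above, not the library's implementation; the class `GridBlockSketch` and the helpers `blockOf` and `blocksTouching` are hypothetical names introduced here, and the sketch assumes the vertex count divides evenly by the grid side.

```java
import java.util.Set;
import java.util.TreeSet;

final class GridBlockSketch {
    // Hypothetical helper: block index of matrix cell (row, col) when the
    // numVertices x numVertices matrix is cut into side x side equal blocks,
    // numbered row by row (P0, P1, ...) as in the figure.
    // Assumes numVertices is divisible by side.
    static int blockOf(int row, int col, int numVertices, int side) {
        int blockSize = numVertices / side;
        return (row / blockSize) * side + (col / blockSize);
    }

    // All blocks that could hold an edge with vertex v on either end.
    static Set<Integer> blocksTouching(int v, int numVertices, int side) {
        Set<Integer> blocks = new TreeSet<>();
        for (int other = 0; other < numVertices; other++) {
            blocks.add(blockOf(v, other, numVertices, side)); // v indexes the row
            blocks.add(blockOf(other, v, numVertices, side)); // v indexes the column
        }
        return blocks;
    }

    public static void main(String[] args) {
        int numVertices = 12;
        int numParts = 9;
        int side = (int) Math.round(Math.sqrt(numParts)); // 3 blocks per side

        System.out.println("E is assigned to P" + blockOf(11, 1, numVertices, side));
        System.out.println("v11 touches " + blocksTouching(11, numVertices, side).size()
                + " blocks; bound is " + 2 * side);
    }
}
```

Because a vertex only ever appears in one block row and one block column, the number of blocks it can touch never exceeds 2 * sqrt(numParts), regardless of how many edges it has.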

Notice that P0 has many edges and as a consequence this partitioning would lead to poor work balance. To improve balance we first multiply each vertex id by a large prime to shuffle the vertex locations.
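The shuffling step can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the constant `MIXING_PRIME` (2^50 - 27, assumed prime) is chosen here for the example because this page does not say which prime is used, the names `MixedGridSketch`, `mix`, and `partitionOf` are hypothetical, and `Math.floorMod` is used so the result stays non-negative if the multiplication overflows.

```java
import java.util.HashSet;
import java.util.Set;

final class MixedGridSketch {
    // Assumption: an illustrative large constant (2^50 - 27, assumed prime).
    // This page does not state which prime the real implementation uses.
    static final long MIXING_PRIME = 1125899906842597L;

    // Scrambles a vertex id into [0, buckets). floorMod keeps the result
    // non-negative even when the multiplication overflows.
    static int mix(long vertexId, int buckets) {
        return (int) Math.floorMod(vertexId * MIXING_PRIME, (long) buckets);
    }

    // Partition for edge (src, dst) on a side x side grid.
    // Assumes numParts is a perfect square.
    static int partitionOf(long src, long dst, int numParts) {
        int side = (int) Math.round(Math.sqrt(numParts));
        int col = mix(src, side); // the source picks the block column
        int row = mix(dst, side); // the destination picks the block row
        return col * side + row;
    }

    public static void main(String[] args) {
        // Edges among v0..v3 all sit in one block under contiguous blocking.
        // After mixing, count how many distinct partitions they land in.
        Set<Integer> used = new HashSet<>();
        for (long src = 0; src < 4; src++) {
            for (long dst = 0; dst < 4; dst++) {
                used.add(partitionOf(src, dst, 9));
            }
        }
        System.out.println("distinct partitions used: " + used.size());
    }
}
```

The replication bound is unaffected by the mixing: all edges with the same source still share one block column, and all edges with the same destination share one block row.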

When the number of partitions requested is not a perfect square, we use a slightly different method in which the last column can have a different number of rows than the others while still maintaining the same size per block.
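One way to realize such a layout is sketched below. The grid dimensions (`cols`, `rows`, `lastColRows`), the mixing constant, and the names `UnevenGridSketch` and `partitionOf` are assumptions made for this example; the page does not spell out the exact formula, so treat this as a plausible reading rather than the definitive method.

```java
final class UnevenGridSketch {
    // Assumption: illustrative mixing constant (2^50 - 27, assumed prime).
    static final long MIXING_PRIME = 1125899906842597L;

    // Partition for edge (src, dst) when numParts need not be a perfect square.
    // Full columns hold `rows` partitions each; the last column holds the rest.
    static int partitionOf(long src, long dst, int numParts) {
        int cols = (int) Math.ceil(Math.sqrt(numParts));
        int rows = (numParts + cols - 1) / cols;       // ceil(numParts / cols)
        int lastColRows = numParts - rows * (cols - 1); // partitions left for the last column

        // The source selects a column; floorMod keeps the hash non-negative.
        int col = (int) (Math.floorMod(src * MIXING_PRIME, (long) numParts) / rows);
        // The destination selects a row among those available in that column.
        int rowsInCol = (col < cols - 1) ? rows : lastColRows;
        int row = (int) Math.floorMod(dst * MIXING_PRIME, (long) rowsInCol);

        return col * rows + row;
    }

    public static void main(String[] args) {
        // With 7 partitions the sketch uses 3 columns: two with 3 rows, one with 1 row.
        System.out.println("edge (1, 11) -> partition " + partitionOf(1L, 11L, 7));
    }
}
```

Every partition index produced this way falls in the range 0 to numParts - 1, and a vertex still appears in at most one column as a source and one row position per column as a destination.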

See Also:

Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.graphx.PartitionStrategy

PartitionStrategy.CanonicalRandomVertexCut$, PartitionStrategy.EdgePartition1D$, PartitionStrategy.EdgePartition2D$, PartitionStrategy.RandomVertexCut$

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final PartitionStrategy.EdgePartition2D$ | MODULE$ | Static reference to the singleton instance of this Scala object. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| EdgePartition2D$() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| int | getPartition(long src, long dst, int numParts) | Returns the partition number for a given edge. |

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface scala.Equals
canEqual, equals

Methods inherited from interface scala.Product
productArity, productElement, productElementName, productElementNames, productIterator, productPrefix

Field Details

- MODULE$
  
  public static final PartitionStrategy.EdgePartition2D$ MODULE$
  
  Static reference to the singleton instance of this Scala object.

Constructor Details

- EdgePartition2D$
  
  public EdgePartition2D$()

Method Details

- getPartition
  
  public int getPartition(long src, long dst, int numParts)
  
  Description copied from interface: PartitionStrategy
  
  Returns the partition number for a given edge.
  
  Specified by:
  
  getPartition in interface PartitionStrategy
