Package org.apache.spark.storage
Class BasicBlockReplicationPolicy
Object
org.apache.spark.storage.BasicBlockReplicationPolicy
- All Implemented Interfaces:
org.apache.spark.internal.Logging
,BlockReplicationPolicy
public class BasicBlockReplicationPolicy
extends Object
implements BlockReplicationPolicy, org.apache.spark.internal.Logging
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionscala.collection.immutable.List<BlockManagerId>
prioritize
(BlockManagerId blockManagerId, scala.collection.Seq<BlockManagerId> peers, scala.collection.mutable.HashSet<BlockManagerId> peersReplicatedTo, BlockId blockId, int numReplicas) Method to prioritize a bunch of candidate peers of a block manager.Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
-
Constructor Details
-
BasicBlockReplicationPolicy
public BasicBlockReplicationPolicy()
-
-
Method Details
-
prioritize
public scala.collection.immutable.List<BlockManagerId> prioritize(BlockManagerId blockManagerId, scala.collection.Seq<BlockManagerId> peers, scala.collection.mutable.HashSet<BlockManagerId> peersReplicatedTo, BlockId blockId, int numReplicas) Method to prioritize a bunch of candidate peers of a block manager. This implementation replicates the behavior of block replication in HDFS. For a given number of replicas needed, we choose a peer within the rack, one outside and remaining blockmanagers are chosen at random, in that order till we meet the number of replicas needed. This works best with a total replication factor of 3, like HDFS.- Specified by:
prioritize
in interfaceBlockReplicationPolicy
- Parameters:
blockManagerId
- Id of the current BlockManager for self identificationpeers
- A list of peers of a BlockManagerpeersReplicatedTo
- Set of peers already replicated toblockId
- BlockId of the block being replicated. This can be used as a source of randomness if needed.numReplicas
- Number of peers we need to replicate to- Returns:
- A prioritized list of peers. Lower the index of a peer, higher its priority
-