Package pyspark :: Module storagelevel :: Class StorageLevel

Class StorageLevel

Flags for controlling the storage of an RDD. Each StorageLevel records whether to use memory, whether to use off-heap storage, whether to spill the RDD to disk when it no longer fits in memory, whether to keep the data in memory in a serialized format, and whether to replicate the RDD partitions on multiple nodes. The class also defines static constants for commonly used storage levels, such as MEMORY_ONLY.

Instance Methods
__init__(self, useDisk, useMemory, useOffHeap, deserialized, replication=1)
__repr__(self)
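
To make the constructor signature concrete, here is a minimal stand-in that runs without a Spark installation. StorageLevelSketch is a hypothetical name used only for illustration; it is not the pyspark source, and the __repr__ format shown is an assumption (the five fields in constructor order).

```python
class StorageLevelSketch:
    """Illustrative stand-in for StorageLevel; not the library source."""

    def __init__(self, useDisk, useMemory, useOffHeap, deserialized, replication=1):
        # Store each flag under the same name as the constructor argument.
        self.useDisk = useDisk
        self.useMemory = useMemory
        self.useOffHeap = useOffHeap
        self.deserialized = deserialized
        self.replication = replication

    def __repr__(self):
        # Assumed format: the five fields, in constructor order.
        return "StorageLevelSketch(%s, %s, %s, %s, %s)" % (
            self.useDisk, self.useMemory, self.useOffHeap,
            self.deserialized, self.replication)


# Same arguments as the MEMORY_AND_DISK_2 class variable.
level = StorageLevelSketch(True, True, False, True, 2)
print(repr(level))
```

Omitting the last argument leaves replication at its default of 1, which is why the constants without a _2 suffix pass only four arguments.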
Class Variables
  DISK_ONLY = StorageLevel(True, False, False, False)
  DISK_ONLY_2 = StorageLevel(True, False, False, False, 2)
  MEMORY_ONLY = StorageLevel(False, True, False, True)
  MEMORY_ONLY_2 = StorageLevel(False, True, False, True, 2)
  MEMORY_ONLY_SER = StorageLevel(False, True, False, False)
  MEMORY_ONLY_SER_2 = StorageLevel(False, True, False, False, 2)
  MEMORY_AND_DISK = StorageLevel(True, True, False, True)
  MEMORY_AND_DISK_2 = StorageLevel(True, True, False, True, 2)
  MEMORY_AND_DISK_SER = StorageLevel(True, True, False, False)
  MEMORY_AND_DISK_SER_2 = StorageLevel(True, True, False, False, 2)
  OFF_HEAP = StorageLevel(False, False, True, False, 1)
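
The flag tuples above can be hard to read at a glance. The following sketch decodes a few of them into plain words. The LEVELS table and the describe helper are hypothetical names introduced for this example; the tuples are copied from the class variables listed above, with the default replication of 1 written out.

```python
# (useDisk, useMemory, useOffHeap, deserialized, replication)
LEVELS = {
    "DISK_ONLY": (True, False, False, False, 1),
    "MEMORY_ONLY": (False, True, False, True, 1),
    "MEMORY_ONLY_SER": (False, True, False, False, 1),
    "MEMORY_AND_DISK_2": (True, True, False, True, 2),
    "OFF_HEAP": (False, False, True, False, 1),
}


def describe(name):
    """Return a one-line, human-readable summary of a storage level."""
    use_disk, use_memory, use_off_heap, deserialized, replication = LEVELS[name]
    # Collect the storage tiers whose flag is set.
    stores = [label for flag, label in (
        (use_memory, "memory"),
        (use_disk, "disk"),
        (use_off_heap, "off-heap"),
    ) if flag]
    form = "deserialized" if deserialized else "serialized"
    return "%s: %s, %s, replication=%d" % (
        name, "+".join(stores), form, replication)


for level_name in LEVELS:
    print(describe(level_name))
```

Reading the output side by side shows the naming pattern: a _SER suffix corresponds to deserialized=False, and a _2 suffix corresponds to replication=2.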