pyspark.streaming.StreamingContext.checkpoint

StreamingContext.checkpoint(directory)

Sets the context to periodically checkpoint the DStream operations for master (driver) fault tolerance. The DStream graph is checkpointed every batch interval.

Parameters
directory : str

HDFS-compatible directory where the checkpoint data will be reliably stored
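
Examples

The following is a minimal sketch of how `checkpoint` is typically combined with `StreamingContext.getOrCreate` so that a restarted driver can recover from the checkpoint data. The path `/tmp/streaming-checkpoint`, the helper names `create_context` and `get_or_recover`, and the one-second batch interval are illustrative assumptions, not part of the API. A production job would normally point at an HDFS-compatible location instead of a local path.

```python
# Hypothetical checkpoint location, used only for illustration.
CHECKPOINT_DIR = "/tmp/streaming-checkpoint"


def create_context(checkpoint_dir=CHECKPOINT_DIR, batch_interval=1):
    """Build a new StreamingContext with checkpointing enabled."""
    # Imported inside the function so this module can be loaded
    # even on a machine without a Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext.getOrCreate()
    ssc = StreamingContext(sc, batch_interval)  # batch interval in seconds
    # Enable periodic checkpointing of the DStream operations.
    ssc.checkpoint(checkpoint_dir)
    # DStream sources and transformations would be defined here,
    # before the context is returned.
    return ssc


def get_or_recover(checkpoint_dir=CHECKPOINT_DIR):
    """Recover a context from checkpoint data, or create a fresh one."""
    from pyspark.streaming import StreamingContext

    # The setup function is only called when no checkpoint data exists.
    return StreamingContext.getOrCreate(
        checkpoint_dir, lambda: create_context(checkpoint_dir)
    )
```

Calling `get_or_recover()` returns a context on which `start()` and `awaitTermination()` can then be invoked, once at least one output operation has been defined inside `create_context`.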