Aggregate the elements of each partition, and then the results for all
the partitions, using a given associative function and a neutral “zero value.”
The function op(t1, t2) is allowed to modify t1 and return it
as its result value to avoid object allocation; however, it should not
This behaves somewhat differently from fold operations implemented
for non-distributed collections in functional languages like Scala.
This fold operation may be applied to partitions individually, and then
fold those results into the final result, rather than apply the fold
to each element sequentially in some defined ordering. For functions
that are not commutative, the result may differ from that of a fold
applied to a non-distributed collection.
>>> from operator import add
>>> sc.parallelize([1, 2, 3, 4, 5]).fold(0, add)