collect() → List[T]
Return a list that contains all the elements in this RDD.
New in version 0.7.0.
Returns: a list containing all the elements
This method should only be used if the resulting list is expected to be small, because all of the data is loaded into the driver’s memory.
>>> sc.range(5).collect()
[0, 1, 2, 3, 4]
>>> sc.parallelize(["x", "y", "z"]).collect()
['x', 'y', 'z']
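One way to respect the driver-memory caveat is to bound the result before materializing it. The sketch below is only an illustration: `collect_if_small` and `max_rows` are hypothetical names, not part of PySpark. It relies on `RDD.take(num)`, which returns at most `num` elements, so the driver never holds more than `max_rows + 1` elements during the check.

```python
def collect_if_small(rdd, max_rows):
    """Hypothetical helper (not part of PySpark).

    Return all elements of ``rdd`` as a list, but refuse to do so
    if the RDD holds more than ``max_rows`` elements.
    """
    # take() returns at most max_rows + 1 elements, so the driver
    # never receives more than that, however large the RDD is.
    sample = rdd.take(max_rows + 1)

    if len(sample) > max_rows:
        raise ValueError(
            f"RDD has more than {max_rows} elements; refusing to collect"
        )

    # The RDD had at most max_rows elements, so take() already
    # returned every element and no separate collect() is needed.
    return sample
```

With a live SparkContext `sc`, a call such as `collect_if_small(sc.range(5), 10)` should give the same list as `sc.range(5).collect()`, while a smaller `max_rows` raises `ValueError` instead of pulling the full dataset to the driver.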