pyspark.RDD.collect
- RDD.collect()
- Return a list that contains all the elements in this RDD.
- New in version 0.7.0.
- Returns
  - list: a list containing all the elements
 
- Notes
  - This method should only be used if the resulting list is expected to be small, as all the data is loaded into the driver’s memory.
- Examples
  >>> sc.range(5).collect()
  [0, 1, 2, 3, 4]
  >>> sc.parallelize(["x", "y", "z"]).collect()
  ['x', 'y', 'z']
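
The note about driver memory can be turned into a guard. The following is a minimal sketch, not part of the PySpark API: `collect_if_small` and its `max_elements` parameter are hypothetical names introduced here for illustration. It relies only on `RDD.take` and `RDD.collect`, and it assumes the RDD is cheap to evaluate twice (or is cached), since the size check and the final collect each run a job.

```python
from typing import Any, List


def collect_if_small(rdd: Any, max_elements: int) -> List[Any]:
    """Hypothetical helper: collect an RDD only if it has at most max_elements items."""
    # take(n) returns at most n elements, so asking for one more than the
    # limit shows whether the RDD is too large without loading all of it.
    head = rdd.take(max_elements + 1)
    if len(head) > max_elements:
        raise ValueError(
            f"RDD has more than {max_elements} elements; refusing to collect"
        )
    # Small enough: collect() now loads every element into the driver's memory.
    return rdd.collect()
```

With a live SparkContext `sc`, as in the examples above, `collect_if_small(sc.range(5), 10)` would be expected to return the same list as `sc.range(5).collect()`, while a limit smaller than the RDD's size raises `ValueError` instead of pulling the data to the driver.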