pyspark.sql.DataFrame.collect

DataFrame.collect() → List[pyspark.sql.types.Row]

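Returns all the records in the DataFrame as a list of Row objects. The full result is loaded into the driver's memory, so call this only when the result is expected to be small.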

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Returns
list

List of Row objects, one per record in the DataFrame.

Examples

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df.collect()
[Row(age=14, name='Tom'), Row(age=23, name='Alice'), Row(age=16, name='Bob')]