pyspark.sql.DataFrame.tail

DataFrame.tail(num)

Returns the last num rows as a list of Row.

Running tail moves the requested rows into the application’s driver process, so a very large num can crash the driver with an OutOfMemoryError.

New in version 3.0.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
num : int

Number of records to return. If the DataFrame contains fewer than num records, all records are returned.

Returns
list

List of rows

Examples

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df.tail(2)
[Row(age=23, name='Alice'), Row(age=16, name='Bob')]
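
Because tail collects rows into the driver, it can help to cap num when the value comes from user input or configuration. The following is a minimal sketch of one way to do that and is not part of PySpark: bounded_tail and max_rows are hypothetical names, and the default cap of 10,000 rows is an arbitrary assumption that should be tuned to the driver's memory and the width of the rows.

```python
def bounded_tail(df, num, max_rows=10_000):
    """Return the last ``num`` rows of ``df``, but never more than ``max_rows``.

    ``bounded_tail`` and ``max_rows`` are hypothetical names used for
    illustration; they are not part of the PySpark API. ``df`` can be any
    object that exposes a ``tail(num)`` method, such as a DataFrame.
    """
    # Clamp the request before calling tail, so an unexpectedly large num
    # never asks the driver to hold more than max_rows rows.
    return df.tail(min(num, max_rows))
```

With the DataFrame from the example above, bounded_tail(df, 2) makes the same call as df.tail(2), while a request larger than max_rows is reduced to max_rows before any data is collected.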