pyspark.sql.DataFrame.head

DataFrame.head(n=None)

Returns the first n rows.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
n : int, optional

Number of rows to return. Defaults to 1.

Returns
If n is supplied, return a list of Row of length n,
or shorter if the DataFrame has fewer than n rows.
If n is omitted, return a single Row, or None if the DataFrame is empty.

Notes

This method should only be used if the resulting list is expected to be small, because all of the returned rows are loaded into the driver’s memory.

Examples

>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df.head()
Row(age=2, name='Alice')
>>> df.head(1)
[Row(age=2, name='Alice')]
>>> df.head(0)
[]
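
The return-shape rule under Returns can be modeled in plain Python without a Spark session. The sketch below is only an illustration, not PySpark's implementation: `head_like` is a hypothetical helper, the rows are plain tuples standing in for Row objects, and rejecting a negative n is a choice made for this sketch rather than documented PySpark behavior.

```python
from typing import List, Optional, Sequence, TypeVar, Union

T = TypeVar("T")


def head_like(rows: Sequence[T], n: Optional[int] = None) -> Union[None, T, List[T]]:
    """Hypothetical helper that mimics the return shape of DataFrame.head."""
    if n is None:
        # No n given: return a single element, or None when there are no rows.
        return rows[0] if rows else None
    if n < 0:
        # Assumption of this sketch only; not taken from the PySpark docs.
        raise ValueError("n must be non-negative")
    # n given: always return a list, with at most n elements.
    return list(rows[:n])


rows = [(2, "Alice"), (5, "Bob")]

single = head_like(rows)        # one element, not wrapped in a list
one_item = head_like(rows, 1)   # a list holding one element
empty = head_like(rows, 0)      # an empty list
capped = head_like(rows, 5)     # both rows, since only two exist
```

The point to take away is that omitting n and passing n=1 return different types, so code that indexes into the result should pick one form and use it consistently.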