pyspark.sql.functions.last_value

pyspark.sql.functions.last_value(col: ColumnOrName, ignoreNulls: Union[bool, pyspark.sql.column.Column, None] = None) → pyspark.sql.column.Column

Returns the last value of col for a group of rows. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.

New in version 3.5.0.

Parameters
col : Column or str

target column to work on.

ignoreNulls : Column or bool

if the last value is null, then look for the last non-null value.

Returns
Column

the last value of col for a group of rows.

Examples

>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [("a", 1), ("a", 2), ("a", 3), ("b", 8), (None, 2)], ["a", "b"]
... ).select(sf.last_value('a'), sf.last_value('b')).show()
+-------------+-------------+
|last_value(a)|last_value(b)|
+-------------+-------------+
|         NULL|            2|
+-------------+-------------+
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [("a", 1), ("a", 2), ("a", 3), ("b", 8), (None, 2)], ["a", "b"]
... ).select(sf.last_value('a', True), sf.last_value('b', True)).show()
+-------------+-------------+
|last_value(a)|last_value(b)|
+-------------+-------------+
|            b|            2|
+-------------+-------------+
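
last_value is also commonly applied as a window function, where the window frame determines which row counts as "last". A minimal sketch of that usage, assuming an active SparkSession named spark and illustrative column names k and v (not from the example above):

>>> from pyspark.sql import Window
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame(
...     [("a", 1), ("a", 2), ("a", None), ("b", 8)], ["k", "v"]
... )
>>> # Use an unbounded frame so every row in a partition sees the
>>> # partition's last value rather than its own row.
>>> w = (Window.partitionBy("k").orderBy("v")
...      .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing))
>>> result = df.select(
...     "k", "v", sf.last_value("v", True).over(w).alias("last_v")
... )
>>> # With ignoreNulls=True, every row in partition "a" gets 2 (the last
>>> # non-null value) and every row in partition "b" gets 8.

Note that with an ORDER BY but no explicit frame, the default window frame ends at the current row, so each row would effectively see itself as the last value; the explicit rowsBetween above avoids that.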