pyspark.sql.functions.any_value

pyspark.sql.functions.any_value(col: ColumnOrName, ignoreNulls: Union[bool, pyspark.sql.column.Column, None] = None) → pyspark.sql.column.Column

Returns some value of col for a group of rows.

New in version 3.5.0.

Parameters
col : Column or str

target column to work on.

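ignoreNulls : Column or bool, optional

whether to skip nulls. If true and the first value is null, the first non-null value is returned instead. If omitted, nulls are not skipped, so the result may be null.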

Returns
Column

some value of col for a group of rows.

Examples

>>> from pyspark.sql.functions import any_value
>>> df = spark.createDataFrame([(None, 1),
...                             ("a", 2),
...                             ("a", 3),
...                             ("b", 8),
...                             ("b", 2)], ["c1", "c2"])
>>> df.select(any_value('c1'), any_value('c2')).collect()
[Row(any_value(c1)=None, any_value(c2)=1)]
>>> df.select(any_value('c1', True), any_value('c2', True)).collect()
[Row(any_value(c1)='a', any_value(c2)=1)]
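
The effect of ignoreNulls in the examples above can be sketched in plain Python. This is only an illustrative model, not Spark's implementation: the helper any_value_model and its ignore_nulls argument are hypothetical names, and the model assumes the value is taken in list order, which matches the output shown above but is not something Spark promises across partitions.

```python
from typing import Any, Iterable, Optional


def any_value_model(values: Iterable[Any], ignore_nulls: bool = False) -> Optional[Any]:
    """Hypothetical model of any_value over a single group of rows.

    Assumption: values are visited in list order. Spark itself may pick
    a different row.
    """
    for value in values:
        # Without ignore_nulls the first value wins, even if it is None.
        # With ignore_nulls, None entries are skipped.
        if value is not None or not ignore_nulls:
            return value
    # Empty group, or only nulls while ignore_nulls is set.
    return None


# Same data as the DataFrame in the examples above, one list per column.
c1 = [None, "a", "a", "b", "b"]
c2 = [1, 2, 3, 8, 2]

without_skip = (any_value_model(c1), any_value_model(c2))
with_skip = (any_value_model(c1, ignore_nulls=True),
             any_value_model(c2, ignore_nulls=True))
```

Here without_skip is (None, 1) and with_skip is ('a', 1), mirroring the two Row results above. A group that contains only nulls still yields None when nulls are skipped.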