pyspark.sql.functions.min_by

pyspark.sql.functions.min_by(col: ColumnOrName, ord: ColumnOrName) → pyspark.sql.column.Column

Returns the value of col from the row in which ord has its minimum value.

New in version 3.3.0.

Parameters
col : Column or str

target column whose value will be returned

ord : Column or str

column to be minimized

Returns
Column

value of col in the row where ord is at its minimum.

Examples

>>> from pyspark.sql.functions import min_by
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(min_by("year", "earnings")).show()
+------+----------------------+
|course|min_by(year, earnings)|
+------+----------------------+
|  Java|                  2012|
|dotNET|                  2012|
+------+----------------------+
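
To make the per-group semantics concrete, the following is a plain-Python sketch of what the example above computes. It is an illustration only and not how Spark implements min_by; the helper name min_by_per_group is hypothetical. Python's min returns the first of several tied rows, whereas this page does not specify how min_by resolves ties.

```python
from collections import defaultdict

# Same rows as the example above: (course, year, earnings).
rows = [
    ("Java", 2012, 20000), ("dotNET", 2012, 5000),
    ("dotNET", 2013, 48000), ("Java", 2013, 30000),
]

def min_by_per_group(rows):
    """Hypothetical helper: for each course, return the year of the
    row whose earnings are lowest (the role played by min_by)."""
    groups = defaultdict(list)
    for course, year, earnings in rows:
        groups[course].append((year, earnings))
    # Pick the pair with the smallest earnings, then keep only its year.
    return {
        course: min(pairs, key=lambda pair: pair[1])[0]
        for course, pairs in groups.items()
    }

result = min_by_per_group(rows)
print(result)
```

Here year plays the role of col and earnings the role of ord: the minimum is taken over earnings, but the value returned is the year from that same row.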