pyspark.sql.GroupedData.mean

GroupedData.mean(*cols: str) → pyspark.sql.dataframe.DataFrame

Computes the average value of each numeric column for each group.

mean() is an alias for avg().

New in version 1.3.0.

Parameters
cols : str

Column names. Non-numeric columns are ignored.

Examples

>>> df.groupBy().mean('age').collect()
[Row(avg(age)=3.5)]
>>> df3.groupBy().mean('age', 'height').collect()
[Row(avg(age)=3.5, avg(height)=82.5)]