pyspark.sql.GroupedData.sum

GroupedData.sum(*cols)

Computes the sum of each numeric column for each group.

New in version 1.3.0.

Parameters:
cols : str

column names. Non-numeric columns are ignored.

Examples

>>> df = spark.createDataFrame(
...     [(2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df.groupBy().sum('age').collect()
[Row(sum(age)=7)]
>>> df3 = spark.createDataFrame(
...     [(2, "Alice", 80), (5, "Bob", 85)], schema=["age", "name", "height"])
>>> df3.groupBy().sum('age', 'height').collect()
[Row(sum(age)=7, sum(height)=165)]