Groups the SparkDataFrame by the specified columns so that aggregations can be run on each group.

Usage

group_by(x, ...)

groupBy(x, ...)

# S4 method for SparkDataFrame
groupBy(x, ...)

# S4 method for SparkDataFrame
group_by(x, ...)

Arguments

x

a SparkDataFrame.

...

character name(s) or Column(s) to group on.

Value

A GroupedData.
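
The returned GroupedData is an intermediate object, so it is normally passed straight to an aggregation such as count() or agg(). The sketch below is a minimal illustration of that flow. It assumes a local Spark installation, and the sample data and its column names (department, gender, age, salary) are hypothetical, chosen only to match the Examples section. It also shows that a grouping column can be given either as a character name or as a Column.

```r
library(SparkR)

# Assumption: Spark is installed locally and can run in local mode.
sparkR.session(master = "local[1]")

# Hypothetical sample data; column names are for illustration only.
local_df <- data.frame(
  department = c("eng", "eng", "ops"),
  gender     = c("f", "m", "f"),
  age        = c(30, 40, 50),
  salary     = c(100, 200, 300),
  stringsAsFactors = FALSE
)
df <- createDataFrame(local_df)

# Group by a column name given as a character string.
by_name <- groupBy(df, "department")

# Group by the same column given as a Column object.
by_column <- groupBy(df, df$department)

# Each call returns a GroupedData; aggregating it yields a SparkDataFrame,
# which collect() brings back as a local data.frame.
counts_by_name   <- collect(count(by_name))
counts_by_column <- collect(count(by_column))

sparkR.session.stop()
```

Both forms should produce the same groups; the Column form is mainly useful when the grouping key is an expression rather than a plain column name.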

Note

groupBy since 1.4.0

group_by since 1.4.0

Examples

if (FALSE) {
  # Compute the average for all numeric columns grouped by department.
  avg(groupBy(df, "department"))

  # Compute the max age and average salary, grouped by department and gender.
  agg(groupBy(df, "department", "gender"), salary = "avg", age = "max")
}