group_by {SparkR}    R Documentation

GroupBy

Description

Groups the SparkDataFrame by the specified columns so that aggregations can be run on each group.

Usage

group_by(x, ...)

groupBy(x, ...)

## S4 method for signature 'SparkDataFrame'
groupBy(x, ...)

## S4 method for signature 'SparkDataFrame'
group_by(x, ...)

Arguments

x

a SparkDataFrame

...

the columns to group by, given as character column names or Column objects

Value

a GroupedData

See Also

GroupedData

Other SparkDataFrame functions: SparkDataFrame-class, [[, agg, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, columns, count, dapply, describe, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, ncol, persist, printSchema, registerTempTable, rename, repartition, sample, saveAsTable, selectExpr, select, showDF, show, str, take, unionAll, unpersist, withColumn, write.df, write.jdbc, write.json, write.parquet, write.text

Examples

## Not run: 
##D   # Compute the average for all numeric columns grouped by department.
##D   avg(groupBy(df, "department"))
##D 
##D   # Compute the max age and average salary, grouped by department and gender.
##D   agg(groupBy(df, "department", "gender"), salary = "avg", age = "max")
## End(Not run)
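
The examples above assume an existing SparkDataFrame named df. The following is a minimal self-contained sketch of the same pattern, assuming a local Spark installation and SparkR 2.0.0 or later; the employees data and its column values are hypothetical and exist only for illustration.

```r
library(SparkR)

# Start a local Spark session (assumes Spark is installed and discoverable).
sparkR.session(master = "local[1]")

# Hypothetical sample data, invented for this sketch.
employees <- data.frame(
  department = c("eng", "eng", "ops", "ops"),
  gender     = c("F", "M", "F", "M"),
  salary     = c(100, 120, 80, 90),
  age        = c(30, 40, 25, 35),
  stringsAsFactors = FALSE
)
df <- createDataFrame(employees)

# groupBy() returns a GroupedData; agg() turns it back into a SparkDataFrame.
grouped <- groupBy(df, "department")
summary_df <- agg(grouped, salary = "avg", age = "max")

# Bring the aggregated rows back to R as a local data.frame.
result <- collect(summary_df)
print(result)

sparkR.session.stop()
```

Per the Usage section, group_by(df, "department") can be used in place of groupBy(df, "department"). Row order in the collected result is not guaranteed, so sort locally if a stable order matters.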

[Package SparkR version 2.0.0 Index]