pyspark.pandas.groupby.GroupBy.rank

GroupBy.rank(method: str = 'average', ascending: bool = True) → FrameLike

Provide the rank of values within each group.

Parameters

method : {‘average’, ‘min’, ‘max’, ‘first’, ‘dense’}, default ‘average’

average: average rank of the tied values
min: lowest rank among the tied values
max: highest rank among the tied values
first: ties are ranked in the order they appear in the data
dense: like ‘min’, but the rank always increases by 1 between distinct values

ascending : bool, default True

If False, rank from highest (1) to lowest (N).

Returns

DataFrame with ranking of values within each group

Examples

>>> df = ps.DataFrame({
...     'a': [1, 1, 1, 2, 2, 2, 3, 3, 3],
...     'b': [1, 2, 2, 2, 3, 3, 3, 4, 4]}, columns=['a', 'b'])
>>> df
   a  b
0  1  1
1  1  2
2  1  2
3  2  2
4  2  3
5  2  3
6  3  3
7  3  4
8  3  4

>>> df.groupby("a").rank().sort_index()
     b
0  1.0
1  2.5
2  2.5
3  1.0
4  2.5
5  2.5
6  1.0
7  2.5
8  2.5

>>> df.b.groupby(df.a).rank(method='max').sort_index()
0    1.0
1    3.0
2    3.0
3    1.0
4    3.0
5    3.0
6    1.0
7    3.0
8    3.0
Name: b, dtype: float64
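
The tie-handling rules behind each method can be traced without a Spark session. The following is a minimal plain-Python sketch of the ranking semantics described under Parameters, not how pandas-on-Spark computes ranks internally. The helper names rank_group and groupby_rank are hypothetical and exist only for this illustration; the data is the same as columns a and b above.

```python
from collections import defaultdict

METHODS = ("average", "min", "max", "first", "dense")


def rank_group(values, method="average", ascending=True):
    """Rank the values of a single group (hypothetical helper)."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    # Stable sort of positions by value, so ties keep their original order.
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not ascending)
    ranks = [0.0] * len(values)
    dense = 0
    start = 0
    while start < len(order):
        # Extend `end` over the run of values tied with the one at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        dense += 1
        lo, hi = start + 1, end + 1  # 1-based positions covered by the tie
        for offset, i in enumerate(order[start:end + 1]):
            if method == "average":
                ranks[i] = (lo + hi) / 2
            elif method == "min":
                ranks[i] = float(lo)
            elif method == "max":
                ranks[i] = float(hi)
            elif method == "first":
                ranks[i] = float(lo + offset)
            else:  # "dense"
                ranks[i] = float(dense)
        start = end + 1
    return ranks


def groupby_rank(keys, values, method="average", ascending=True):
    """Rank `values` separately within each distinct key (hypothetical helper)."""
    positions = defaultdict(list)
    for pos, key in enumerate(keys):
        positions[key].append(pos)
    result = [0.0] * len(values)
    for members in positions.values():
        group_ranks = rank_group([values[p] for p in members], method, ascending)
        for p, r in zip(members, group_ranks):
            result[p] = r
    return result


a = [1, 1, 1, 2, 2, 2, 3, 3, 3]
b = [1, 2, 2, 2, 3, 3, 3, 4, 4]
average_ranks = groupby_rank(a, b)            # 1.0, 2.5, 2.5 in every group
max_ranks = groupby_rank(a, b, method="max")  # 1.0, 3.0, 3.0 in every group
```

Within each group the two tied values occupy positions 2 and 3, so ‘average’ assigns both 2.5 and ‘max’ assigns both 3.0, which agrees with the outputs shown above.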
