pyspark.pandas.groupby.GroupBy.idxmax

GroupBy.idxmax(skipna=True)

Return the index of the first occurrence of the maximum within each group. NA/null values are excluded.

Parameters

skipna : boolean, default True
    Exclude NA/null values. If an entire row/column is NA, the result will be NA.

See also

Series.idxmax
DataFrame.idxmax
pyspark.pandas.Series.groupby
pyspark.pandas.DataFrame.groupby

Examples

>>> df = ps.DataFrame({'a': [1, 1, 2, 2, 3],
...                    'b': [1, 2, 3, 4, 5],
...                    'c': [5, 4, 3, 2, 1]}, columns=['a', 'b', 'c'])

>>> df.groupby(['a'])['b'].idxmax().sort_index() 
a
1  1
2  3
3  4
Name: b, dtype: int64

>>> df.groupby(['a']).idxmax().sort_index() 
   b  c
a
1  1  0
2  3  2
3  4  4
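
The "first occurrence" and NA-exclusion rules can be sketched with plain pandas, which is used here only as a stand-in on the assumption that pandas-on-Spark mirrors its behavior for the default skipna=True. The frame `pdf` and its values are hypothetical and not part of the original examples.

```python
import pandas as pd

# Hypothetical data: group 1 has a tied maximum (7.0 at labels 1 and 2),
# group 2 has a missing value at label 3.
pdf = pd.DataFrame({'a': [1, 1, 1, 2, 2],
                    'b': [3.0, 7.0, 7.0, None, 4.0]})

# Plain pandas, assumed to behave like pyspark.pandas for skipna=True.
result = pdf.groupby('a')['b'].idxmax()

# Group 1 -> label 1 (the first of the tied maxima).
# Group 2 -> label 4 (the NaN at label 3 is skipped).
print(result)
```

The returned values are index labels of the original frame, not positions within each group, so they can be passed to `.loc` to retrieve the matching rows.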