pyspark.pandas.DataFrame.sort_values
- DataFrame.sort_values(by, ascending=True, inplace=False, na_position='last', ignore_index=False)
- Sort by the values in one or more columns.
- Parameters
- by: str or list of str
- ascending: bool or list of bool, default True
  - Sort ascending vs. descending. Specify a list for multiple sort orders. If this is a list of bools, it must match the length of by.
- inplace: bool, default False
  - If True, perform the operation in place.
- na_position: {‘first’, ‘last’}, default ‘last’
  - ‘first’ puts NaNs at the beginning; ‘last’ puts NaNs at the end.
- ignore_index: bool, default False
  - If True, the resulting axis will be labeled 0, 1, …, n - 1.
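
The way ascending and na_position combine across several sort keys can be modelled in plain Python. The following is a minimal sketch, not the library's implementation: `sort_rows` is a hypothetical helper, rows are plain dicts, and only `None` is treated as a missing value (NaN floats are not handled). It assumes that missing values are placed by na_position alone, whatever the direction of the key, which is consistent with the descending example in this page.

```python
from functools import cmp_to_key


def sort_rows(rows, by, ascending=True, na_position="last"):
    """Illustrative model of the documented ordering; not the library's code.

    `rows` is a list of dicts, and None stands in for a missing value.
    """
    if isinstance(by, str):
        by = [by]
    if isinstance(ascending, bool):
        ascending = [ascending] * len(by)
    if len(ascending) != len(by):
        raise ValueError("ascending must match the length of by")
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")

    def compare(left, right):
        for col, asc in zip(by, ascending):
            a, b = left[col], right[col]
            if a is None and b is None:
                continue  # both missing: tie on this key, try the next one
            if a is None or b is None:
                # Missing values are placed by na_position alone,
                # regardless of the sort direction for this key.
                a_goes_first = (a is None) == (na_position == "first")
                return -1 if a_goes_first else 1
            if a != b:
                # Flip the natural comparison for descending keys.
                return -1 if (a < b) == asc else 1
        return 0

    return sorted(rows, key=cmp_to_key(compare))


rows = [
    {"col1": "A", "col2": 2},
    {"col1": "A", "col2": 1},
    {"col1": "B", "col2": 9},
    {"col1": None, "col2": 8},
    {"col1": "D", "col2": 7},
    {"col1": "C", "col2": 4},
]

# col1 ascending, col2 descending, rows with a missing col1 first
result = sort_rows(rows, by=["col1", "col2"], ascending=[True, False],
                   na_position="first")
print([(r["col1"], r["col2"]) for r in result])
```

Passing a list for ascending whose length differs from by raises a `ValueError` in this sketch, mirroring the length requirement stated for the ascending parameter.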
 
- Returns
- sorted_obj: DataFrame
 
- Examples

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({
...     'col1': ['A', 'B', None, 'D', 'C'],
...     'col2': [2, 9, 8, 7, 4],
...     'col3': [0, 9, 4, 2, 3],
...   },
...   columns=['col1', 'col2', 'col3'],
...   index=['a', 'b', 'c', 'd', 'e'])
>>> df
   col1  col2  col3
a     A     2     0
b     B     9     9
c  None     8     4
d     D     7     2
e     C     4     3

Sort by col1

>>> df.sort_values(by=['col1'])
   col1  col2  col3
a     A     2     0
b     B     9     9
e     C     4     3
d     D     7     2
c  None     8     4

Ignore index for the resulting axis

>>> df.sort_values(by=['col1'], ignore_index=True)
   col1  col2  col3
0     A     2     0
1     B     9     9
2     C     4     3
3     D     7     2
4  None     8     4

Sort descending

>>> df.sort_values(by='col1', ascending=False)
   col1  col2  col3
d     D     7     2
e     C     4     3
b     B     9     9
a     A     2     0
c  None     8     4

Sort by multiple columns

>>> df = ps.DataFrame({
...     'col1': ['A', 'A', 'B', None, 'D', 'C'],
...     'col2': [2, 1, 9, 8, 7, 4],
...     'col3': [0, 1, 9, 4, 2, 3],
...   },
...   columns=['col1', 'col2', 'col3'])
>>> df.sort_values(by=['col1', 'col2'])
   col1  col2  col3
1     A     1     1
0     A     2     0
2     B     9     9
5     C     4     3
4     D     7     2
3  None     8     4
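
The examples above do not exercise a list of sort directions, na_position='first', or inplace=True. The sketch below covers those three cases. It uses plain pandas as a stand-in so that it runs without a Spark session; this rests on the assumption that the pandas API on Spark follows pandas for these parameters. Swapping the import for `import pyspark.pandas as ps` and building the frame with `ps.DataFrame` is expected to give the same row order, though that is not verified here.

```python
import pandas as pd  # stand-in for pyspark.pandas (assumption, see the note above)

df = pd.DataFrame(
    {
        "col1": ["A", "A", "B", None, "D", "C"],
        "col2": [2, 1, 9, 8, 7, 4],
        "col3": [0, 1, 9, 4, 2, 3],
    }
)

# One direction per sort key: col1 ascending, col2 descending.
mixed = df.sort_values(by=["col1", "col2"], ascending=[True, False])
print(mixed.index.tolist())  # [0, 1, 2, 5, 4, 3]

# Same ordering, but the row with a missing col1 comes first.
nulls_first = df.sort_values(
    by=["col1", "col2"], ascending=[True, False], na_position="first"
)
print(nulls_first.index.tolist())  # [3, 0, 1, 2, 5, 4]

# inplace=True reorders the object itself and returns None.
in_place = df.copy()
returned = in_place.sort_values(by="col2", inplace=True)
print(returned)  # None
print(in_place.index.tolist())  # [1, 0, 5, 4, 3, 2]
```

The index labels are printed instead of the full frames to keep the output short; each label refers to a row of the six-row frame defined at the top of the block.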