pyspark.pandas.Index.sort_values

Index.sort_values(return_indexer: bool = False, ascending: bool = True) → Union[pyspark.pandas.indexes.base.Index, Tuple[pyspark.pandas.indexes.base.Index, pyspark.pandas.indexes.base.Index]]

Return a sorted copy of the index and, optionally, the indices that would sort it.

Note

pandas does not support this method when the index contains NaN values; it raises an unexpected TypeError. pandas-on-Spark supports this case by treating NaN as the smallest value. This method also returns the indexer as a pandas-on-Spark Index, whereas pandas returns it as a list, because the indexer in pandas-on-Spark may not fit in memory.
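The "NaN as the smallest value" ordering described in the note can be illustrated with a plain-Python sketch. This is not the pandas-on-Spark implementation; the helper name `sort_nan_first` is hypothetical and only mimics the ordering on a small in-memory list.

```python
import math

def sort_nan_first(values, ascending=True):
    """Sort floats while ordering NaN below every other value.

    Hypothetical helper for illustration only; it is not part of pyspark.
    """
    def key(v):
        # NaN maps to (0, 0.0) and real numbers to (1, v),
        # so NaN always compares as the smallest element.
        return (0, 0.0) if math.isnan(v) else (1, v)

    return sorted(values, key=key, reverse=not ascending)

result = sort_nan_first([10.0, float("nan"), 1.0])
result_desc = sort_nan_first([10.0, float("nan"), 1.0], ascending=False)
```

Under this assumption, NaN comes first in ascending order and last in descending order.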

Parameters
return_indexer : bool, default False

Whether to also return the indices that would sort the index.

ascending : bool, default True

Whether to sort the index values in ascending order.

Returns
sorted_index : ps.Index or ps.MultiIndex

Sorted copy of the index.

indexer : ps.Index

The positions in the original index that produce the sorted order. Returned only when return_indexer is True.
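To make the meaning of the indexer concrete, here is a minimal plain-Python sketch of the semantics. It runs on an ordinary list rather than a pandas-on-Spark Index, and the function name `sort_with_indexer` is hypothetical.

```python
def sort_with_indexer(values, ascending=True):
    """Return (sorted_values, indexer) for a plain list.

    indexer[k] is the position in `values` of the element that ends up
    at position k of the sorted result. Illustrative only; not pyspark code.
    """
    positions = sorted(
        range(len(values)),
        key=lambda i: values[i],
        reverse=not ascending,
    )
    return [values[i] for i in positions], positions

sorted_vals, indexer = sort_with_indexer([10, 100, 1, 1000], ascending=False)
# sorted_vals is [1000, 100, 10, 1] and indexer is [3, 1, 0, 2]
```

Taking the original values at the indexer positions, in order, reproduces the sorted result.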

See also

Series.sort_values

Sort values of a Series.

DataFrame.sort_values

Sort values in a DataFrame.

Examples

>>> idx = ps.Index([10, 100, 1, 1000])
>>> idx  
Int64Index([10, 100, 1, 1000], dtype='int64')

Sort values in ascending order (default behavior).

>>> idx.sort_values()  
Int64Index([1, 10, 100, 1000], dtype='int64')

Sort values in descending order.

>>> idx.sort_values(ascending=False)  
Int64Index([1000, 100, 10, 1], dtype='int64')

Sort values in descending order and also return the indices that idx was sorted by.

>>> idx.sort_values(ascending=False, return_indexer=True)  
(Int64Index([1000, 100, 10, 1], dtype='int64'), Int64Index([3, 1, 0, 2], dtype='int64'))

MultiIndex is also supported.

>>> psidx = ps.MultiIndex.from_tuples([('a', 'x', 1), ('c', 'y', 2), ('b', 'z', 3)])
>>> psidx  
MultiIndex([('a', 'x', 1),
            ('c', 'y', 2),
            ('b', 'z', 3)],
           )
>>> psidx.sort_values()  
MultiIndex([('a', 'x', 1),
            ('b', 'z', 3),
            ('c', 'y', 2)],
           )
>>> psidx.sort_values(ascending=False)  
MultiIndex([('c', 'y', 2),
            ('b', 'z', 3),
            ('a', 'x', 1)],
           )
>>> psidx.sort_values(ascending=False, return_indexer=True)  
(MultiIndex([('c', 'y', 2),
            ('b', 'z', 3),
            ('a', 'x', 1)],
           ), Int64Index([1, 2, 0], dtype='int64'))