pyspark.sql.DataFrame.sortWithinPartitions

DataFrame.sortWithinPartitions(*cols, **kwargs)

Returns a new DataFrame with each partition sorted by the specified column(s).

New in version 1.6.0.

Parameters:
cols : str, list or Column, optional

List of Column objects or column names to sort by.

Other Parameters:
ascending : bool or list, optional

Boolean or list of booleans (default True). Whether to sort ascending (True) or descending (False). Pass a list to set the order per column; the length of the list must equal the number of columns in cols.

Examples

>>> df.sortWithinPartitions("age", ascending=False).show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
|  5|  Bob|
+---+-----+