pyspark.sql.functions.array_sort
pyspark.sql.functions.array_sort(col: ColumnOrName, comparator: Optional[Callable[[pyspark.sql.column.Column, pyspark.sql.column.Column], pyspark.sql.column.Column]] = None) → pyspark.sql.column.Column

Collection function: sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array.
New in version 2.4.0.
Changed in version 3.4.0: Can take a comparator function.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - col : Column or str
    Name of column or expression.
  - comparator : callable, optional
    A binary function (Column, Column) -> Column. The comparator takes two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error.
- Returns
  - Column
    Sorted array.
Examples
>>> df = spark.createDataFrame([([2, 1, None, 3],), ([1],), ([],)], ['data'])
>>> df.select(array_sort(df.data).alias('r')).collect()
[Row(r=[1, 2, 3, None]), Row(r=[1]), Row(r=[])]
>>> df = spark.createDataFrame([(["foo", "foobar", None, "bar"],), (["foo"],), ([],)], ['data'])
>>> df.select(array_sort(
...     "data",
...     lambda x, y: when(x.isNull() | y.isNull(), lit(0)).otherwise(length(y) - length(x))
... ).alias("r")).collect()
[Row(r=['foobar', 'foo', None, 'bar']), Row(r=['foo']), Row(r=[])]
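The three-way comparator contract (negative, zero, or positive) mirrors Python's classic cmp-style comparators. As a minimal sketch of the default semantics without a running SparkSession, the snippet below reimplements "ascending order, nulls last" in pure Python using `functools.cmp_to_key`; the helper name `array_sort_default` and the use of `None` to stand in for SQL NULL are illustrative assumptions, not part of the PySpark API.

```python
from functools import cmp_to_key

def array_sort_default(arr):
    """Illustrative pure-Python analogue of array_sort's default behaviour:
    ascending order, with None (standing in for SQL NULL) sorted last."""
    def cmp(x, y):
        # Treat None like SQL NULL: it compares after every non-null value.
        if x is None and y is None:
            return 0
        if x is None:
            return 1
        if y is None:
            return -1
        # Negative / 0 / positive, exactly the contract a comparator must obey.
        return (x > y) - (x < y)
    return sorted(arr, key=cmp_to_key(cmp))

print(array_sort_default([2, 1, None, 3]))  # [1, 2, 3, None]
print(array_sort_default([]))               # []
```

The same contract applies to a custom comparator passed to array_sort: return a column evaluating to a negative number, zero, or a positive number, and never to null.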