pyspark.sql.functions.sort_array
- pyspark.sql.functions.sort_array(col, asc=True)
Array function: Sorts the input array in ascending or descending order according to the natural ordering of the array elements. Null elements are placed at the beginning of the returned array when sorting in ascending order, and at the end when sorting in descending order.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- col : Column or str
Name of the column or expression.
- asc : bool, optional
Whether to sort in ascending or descending order. If asc is True (default), then the sorting is in ascending order. If False, then in descending order.
- Returns
Column
Sorted array.
Examples
Example 1: Sorting an array in ascending order
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|       [NULL, 1, 2, 3]|
+----------------------+
Example 2: Sorting an array in descending order
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([2, 1, None, 3],)], ['data'])
>>> df.select(sf.sort_array(df.data, asc=False)).show()
+-----------------------+
|sort_array(data, false)|
+-----------------------+
|        [3, 2, 1, NULL]|
+-----------------------+
Example 3: Sorting an array with a single element
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([([1],)], ['data'])
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|                   [1]|
+----------------------+
Example 4: Sorting an empty array
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType
>>> schema = StructType([StructField("data", ArrayType(StringType()), True)])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|                    []|
+----------------------+
Example 5: Sorting an array containing only null values
>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, IntegerType, StructType, StructField
>>> schema = StructType([StructField("data", ArrayType(IntegerType()), True)])
>>> df = spark.createDataFrame([([None, None, None],)], schema=schema)
>>> df.select(sf.sort_array(df.data)).show()
+----------------------+
|sort_array(data, true)|
+----------------------+
|    [NULL, NULL, NULL]|
+----------------------+
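The null placement rule stated in the description can be illustrated without a Spark session. The sketch below is a plain-Python model of that rule only, not Spark's implementation: `sort_array_model` is a hypothetical helper introduced here for illustration, and it assumes the non-null elements are mutually comparable integers.

```python
from typing import List, Optional


def sort_array_model(values: List[Optional[int]], asc: bool = True) -> List[Optional[int]]:
    """Hypothetical plain-Python model of the documented ordering (not Spark code)."""
    # Separate nulls from the values that have a natural ordering.
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=not asc)
    # Ascending: nulls come first. Descending: nulls come last.
    return nulls + non_nulls if asc else non_nulls + nulls


print(sort_array_model([2, 1, None, 3]))             # [None, 1, 2, 3]
print(sort_array_model([2, 1, None, 3], asc=False))  # [3, 2, 1, None]
```

The two printed lists mirror the results of Example 1 and Example 2, with Python's None standing in for NULL.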