pyspark.sql.functions.shuffle

pyspark.sql.functions.shuffle(col: ColumnOrName) → pyspark.sql.column.Column

Collection function: Generates a random permutation of the given array.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str

Name of the array column, or a column expression that evaluates to the array to shuffle.

Returns
Column

An array column containing the elements of the input array in random order.

Notes

The function is non-deterministic: evaluating it again on the same input may return the elements in a different order.

Examples

>>> from pyspark.sql.functions import shuffle
>>> df = spark.createDataFrame([([1, 20, 3, 5],), ([1, 20, None, 3],)], ['data'])
>>> df.select(shuffle(df.data).alias('s')).collect()  # doctest: +SKIP
[Row(s=[3, 1, 5, 20]), Row(s=[20, None, 3, 1])]
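
Because the order shown above is only one possible result, it can help to reason about what stays fixed. The following is a rough plain-Python model of the per-row behavior, not Spark's implementation: each array is permuted independently, and its length and elements (including None) are preserved. The helper `shuffle_rows` and its `seed` parameter are hypothetical names introduced only for this sketch.

```python
import random
from collections import Counter


def shuffle_rows(rows, seed=None):
    """Plain-Python model (assumption, not Spark code): permute each array independently."""
    rng = random.Random(seed)
    # rng.sample with k=len(arr) returns a new list holding every element
    # of arr exactly once, in random order; the input list is not modified.
    return [rng.sample(arr, k=len(arr)) for arr in rows]


data = [[1, 20, 3, 5], [1, 20, None, 3]]
shuffled = shuffle_rows(data)

for before, after in zip(data, shuffled):
    # The order may differ, but the length and the multiset of elements do not.
    assert len(before) == len(after)
    assert Counter(before) == Counter(after)
```

Under this model, a check on shuffled output should compare contents in an order-insensitive way (for example, by counting elements) rather than compare against one fixed ordering.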