Aggregate function: returns a set of objects with duplicate elements eliminated.
New in version 1.6.0.
The function is non-deterministic because the order of the collected results depends
on the order of the rows, which may be non-deterministic after a shuffle.
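The ordering caveat above can be illustrated with a small plain-Python toy model. This is only a sketch and does not reflect Spark's internals: the helper `collect_distinct` and the nested lists standing in for partitions are hypothetical names introduced for illustration. It shows that the same rows can yield the same distinct values in a different order when the rows arrive in a different order.

```python
# Toy model in plain Python (an assumption for illustration, not Spark's
# implementation): each inner list stands in for one partition's rows, and
# the order of the outer list stands in for the row order after a shuffle.
def collect_distinct(partitions):
    """Return the distinct values in first-seen order."""
    seen = set()
    result = []
    for rows in partitions:
        for value in rows:
            if value not in seen:
                seen.add(value)
                result.append(value)
    return result


run_a = collect_distinct([[2], [5, 5]])  # partition holding 2 arrives first
run_b = collect_distinct([[5, 5], [2]])  # partition holding 5 arrives first

print(run_a)  # [2, 5]
print(run_b)  # [5, 2]

# Sorting removes the dependence on arrival order.
print(sorted(run_a) == sorted(run_b))  # True
```

When a stable order matters, for example in tests, one option is to sort the collected array before comparing it, such as by wrapping the aggregate in `sort_array` from `pyspark.sql.functions`.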
>>> df2 = spark.createDataFrame([(2,), (5,), (5,)], ('age',))