pyspark.sql.functions.countDistinct

pyspark.sql.functions.countDistinct(col, *cols)

Returns a new Column holding the number of distinct values of col, or the number of distinct combinations of values when additional cols are given.

New in version 1.3.0.

Examples

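>>> from pyspark.sql.functions import countDistinct
>>> # Assumed setup: `spark` is an active SparkSession; the sample rows are illustrative.
>>> df = spark.createDataFrame([(2, 'Alice'), (5, 'Bob')], ['age', 'name'])
>>> df.agg(countDistinct(df.age, df.name).alias('c')).collect()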
[Row(c=2)]
>>> df.agg(countDistinct("age", "name").alias('c')).collect()
[Row(c=2)]
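
The semantics of the aggregate can be modelled in plain Python. The sketch below is only an illustration and not how Spark computes the result: the helper `count_distinct` and the sample `rows` are hypothetical. It assumes that, as with Spark SQL's `COUNT(DISTINCT ...)`, a row with a null in any of the listed columns is left out of the count.

```python
# Plain-Python model of the aggregate; this is not Spark's implementation.
# The rows below are hypothetical sample data.
rows = [
    {"age": 2, "name": "Alice"},
    {"age": 5, "name": "Bob"},
    {"age": 5, "name": "Bob"},      # duplicate combination, counted once
    {"age": None, "name": "Eve"},   # null age: excluded when "age" is listed
]


def count_distinct(rows, *cols):
    """Count distinct combinations of the given columns, skipping nulls."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if any(value is None for value in key):
            continue  # assumption: a null in any listed column excludes the row
        seen.add(key)
    return len(seen)


print(count_distinct(rows, "age", "name"))  # distinct (age, name) pairs
print(count_distinct(rows, "age"))          # distinct ages
print(count_distinct(rows, "name"))         # distinct names
```

With several columns the count is taken over combinations, so two rows are duplicates only if they agree on every listed column.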