pyspark.sql.functions.kll_sketch_get_n_double

pyspark.sql.functions.kll_sketch_get_n_double(col)

Returns the number of items collected in the KLL double sketch.

New in version 4.1.0.

Parameters
col : Column or column name

The binary representation of a KLL double sketch.

Returns
Column

The count of items in the sketch.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([1.0, 2.0, 3.0, 4.0, 5.0], "DOUBLE")
>>> sketch_df = df.agg(sf.kll_sketch_agg_double("value").alias("sketch"))
>>> sketch_df.select(sf.kll_sketch_get_n_double("sketch")).show()
+-------------------------------+
|kll_sketch_get_n_double(sketch)|
+-------------------------------+
|                              5|
+-------------------------------+