pyspark.sql.functions.kll_sketch_get_n_bigint

pyspark.sql.functions.kll_sketch_get_n_bigint(col)

Returns the number of items collected in the KLL bigint sketch.

New in version 4.1.0.

Parameters
col : Column or column name

The binary representation of a KLL bigint sketch, for example one produced by kll_sketch_agg_bigint.

Returns
Column

The count of items in the sketch.

Examples

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([1,2,3,4,5], "INT")
>>> sketch_df = df.agg(sf.kll_sketch_agg_bigint("value").alias("sketch"))
>>> sketch_df.select(sf.kll_sketch_get_n_bigint("sketch")).show()
+-------------------------------+
|kll_sketch_get_n_bigint(sketch)|
+-------------------------------+
|                              5|
+-------------------------------+