pyspark.sql.functions.array_compact
pyspark.sql.functions.array_compact(col)
Array function: removes null values from the array.

New in version 3.4.0.

Parameters
col : Column or str
    name of column or expression.
Returns
Column
    A new column that is an array excluding the null values from the input column.
Notes

Supports Spark Connect.

Examples

Example 1: Removing null values from a simple array

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, None, 2, 3],)], ['data'])
>>> df.select(sf.array_compact(df.data)).show()
+-------------------+
|array_compact(data)|
+-------------------+
|          [1, 2, 3]|
+-------------------+

Example 2: Removing null values from multiple arrays

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, None, 2, 3],), ([4, 5, None, 4],)], ['data'])
>>> df.select(sf.array_compact(df.data)).show()
+-------------------+
|array_compact(data)|
+-------------------+
|          [1, 2, 3]|
|          [4, 5, 4]|
+-------------------+

Example 3: Removing null values from an array with all null values

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType
>>> schema = StructType([
...     StructField("data", ArrayType(StringType()), True)
... ])
>>> df = spark.createDataFrame([([None, None, None],)], schema)
>>> df.select(sf.array_compact(df.data)).show()
+-------------------+
|array_compact(data)|
+-------------------+
|                 []|
+-------------------+

Example 4: Removing null values from an array with no null values

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3],)], ['data'])
>>> df.select(sf.array_compact(df.data)).show()
+-------------------+
|array_compact(data)|
+-------------------+
|          [1, 2, 3]|
+-------------------+

Example 5: Removing null values from an empty array

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, StructField, StructType
>>> schema = StructType([
...     StructField("data", ArrayType(StringType()), True)
... ])
>>> df = spark.createDataFrame([([],)], schema)
>>> df.select(sf.array_compact(df.data)).show()
+-------------------+
|array_compact(data)|
+-------------------+
|                 []|
+-------------------+
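Because col accepts either a Column or a column-name string, the function can also be called with just the name of the array column. The following is a minimal sketch of that variant (reusing the same kind of DataFrame as above); the alias is added only to give the result a shorter output column name.

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, None, 2, 3],)], ['data'])
>>> df.select(sf.array_compact("data").alias("compacted")).show()
+---------+
|compacted|
+---------+
|[1, 2, 3]|
+---------+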