pyspark.sql.functions.array_intersect

pyspark.sql.functions.array_intersect(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column

Collection function: returns an array of the elements in the intersection of col1 and col2, without duplicates.

New in version 2.4.0.

Parameters
col1 : Column or str

column, or name of the column, containing the first array

col2 : Column or str

column, or name of the column, containing the second array

Examples

>>> from pyspark.sql import Row
>>> from pyspark.sql.functions import array_intersect
>>> df = spark.createDataFrame([Row(c1=["b", "a", "c"], c2=["c", "d", "a", "f"])])
>>> df.select(array_intersect(df.c1, df.c2)).collect()
[Row(array_intersect(c1, c2)=['a', 'c'])]
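
The "without duplicates" behavior can be illustrated with a small pure-Python sketch that runs without a Spark session. This is not Spark's implementation: `array_intersect_sketch` is a hypothetical helper, and it assumes that result elements keep the order in which they first appear in the first array, which is consistent with the output shown above but is not stated as a guarantee on this page. Null handling is not modeled.

```python
from typing import Hashable, List, Sequence


def array_intersect_sketch(a: Sequence[Hashable], b: Sequence[Hashable]) -> List[Hashable]:
    """Hypothetical helper: elements present in both a and b, each kept once.

    Assumption: the result follows the order of first appearance in `a`.
    """
    in_b = set(b)   # membership lookup for the second array
    seen = set()    # elements already emitted, to drop duplicates
    result = []
    for item in a:
        if item in in_b and item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Same input as the example above.
print(array_intersect_sketch(["b", "a", "c"], ["c", "d", "a", "f"]))  # ['a', 'c']

# Repeated elements in either input appear only once in the result.
print(array_intersect_sketch(["a", "a", "b"], ["a", "b", "b"]))  # ['a', 'b']

# Disjoint inputs give an empty list.
print(array_intersect_sketch(["x"], ["y"]))  # []
```

The sketch only mirrors the set semantics described here; in a real job the work is done by Spark on array columns, so use `array_intersect` itself on a DataFrame rather than a Python-side helper.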