pyspark.sql.functions.arrays_zip

pyspark.sql.functions.arrays_zip(*cols: ColumnOrName) → pyspark.sql.column.Column

Collection function: Returns a merged array of structs in which the N-th struct contains all N-th values of input arrays.

New in version 2.4.0.

Parameters
cols : Column or str
    Columns of arrays to be merged.

Examples

>>> from pyspark.sql.functions import arrays_zip
>>> df = spark.createDataFrame([([1, 2, 3], [2, 3, 4])], ['vals1', 'vals2'])
>>> df.select(arrays_zip(df.vals1, df.vals2).alias('zipped')).collect()
[Row(zipped=[Row(vals1=1, vals2=2), Row(vals1=2, vals2=3), Row(vals1=3, vals2=4)])]
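The element-wise pairing described above can be modelled in plain Python without a Spark session. The following is only a sketch of the semantics, not Spark's implementation: `arrays_zip_model` is a hypothetical helper, structs are represented as dicts, and the padding of shorter arrays with None (null) is an assumption about how arrays of unequal length are handled, since the description above does not state it.

```python
from itertools import zip_longest


def arrays_zip_model(**arrays):
    """Hypothetical pure-Python model of arrays_zip; not part of PySpark."""
    names = list(arrays)
    # zip_longest walks all arrays in parallel; positions missing from a
    # shorter array are filled with None (assumed null-padding behaviour).
    return [dict(zip(names, values)) for values in zip_longest(*arrays.values())]


# Same data as the example above: the N-th struct holds the N-th values.
zipped = arrays_zip_model(vals1=[1, 2, 3], vals2=[2, 3, 4])

# Unequal lengths: the result is as long as the longest input array.
ragged = arrays_zip_model(vals1=[1, 2, 3], vals2=[2])

print(zipped)
print(ragged)
```

Each dict plays the role of one Row in the collected result, with keys taken from the input column names.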