pyspark.sql.functions.array_join

pyspark.sql.functions.array_join(col: ColumnOrName, delimiter: str, null_replacement: Optional[str] = None) → pyspark.sql.column.Column[source]

Concatenates the elements of column using the delimiter. Null values are replaced with null_replacement if set, otherwise they are ignored.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
colColumn or str

target column to work on.

delimiterstr

delimiter used to concatenate elements

null_replacementstr, optional

if set then null values will be replaced by this value

Returns
Column

a column of string type. Concatenated values.

Examples

>>> df = spark.createDataFrame([(["a", "b", "c"],), (["a", None],)], ['data'])
>>> df.select(array_join(df.data, ",").alias("joined")).collect()
[Row(joined='a,b,c'), Row(joined='a')]
>>> df.select(array_join(df.data, ",", "NULL").alias("joined")).collect()
[Row(joined='a,b,c'), Row(joined='a,NULL')]