pyspark.sql.functions.stack

pyspark.sql.functions.stack(*cols: ColumnOrName) → pyspark.sql.column.Column[source]

Separates col1, …, colk into n rows. Uses column names col0, col1, etc. by default unless specified otherwise.

New in version 3.5.0.

Parameters
colsColumn or str

the first element should be a literal int for the number of rows to be separated, and the remaining are input elements to be separated.

Examples

>>> df = spark.createDataFrame([(1, 2, 3)], ["a", "b", "c"])
>>> df.select(stack(lit(2), df.a, df.b, df.c)).show(truncate=False)
+----+----+
|col0|col1|
+----+----+
|1   |2   |
|3   |NULL|
+----+----+