pyspark.sql.functions.map_from_entries

pyspark.sql.functions.map_from_entries(col)

Map function: Transforms an array of key-value pair entries (structs with two fields) into a map. The first field of each entry becomes the key and the second field becomes the value in the resulting map column.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str

Name of a column, or a column expression, that holds an array of two-field structs.

Returns
Column

A map created from the given array of entries.
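
As a rough illustration of the description above, the per-row transformation can be modeled in plain Python. This is only a sketch, not Spark's implementation: the helper name map_from_entries_model is hypothetical, each struct is represented as a 2-tuple, and Spark's handling of duplicate or null keys is deliberately not modeled.

```python
from typing import Any, Dict, Sequence, Tuple


def map_from_entries_model(entries: Sequence[Tuple[Any, Any]]) -> Dict[Any, Any]:
    """Hypothetical plain-Python model of map_from_entries for a single row.

    Each entry stands in for a two-field struct: the first field is used
    as the key and the second field as the value.
    """
    result: Dict[Any, Any] = {}
    for key, value in entries:
        # Assumption: keys are unique and non-null; Spark's behavior for
        # duplicate or null keys is not modeled here.
        result[key] = value
    return result


print(map_from_entries_model([(1, "a"), (2, "b")]))
print(map_from_entries_model([(1, None), (2, "b")]))
print(map_from_entries_model([]))
```

The three calls mirror the inputs used in Examples 1, 2, and 4 below: a null value is kept as a value in the map, and an empty array yields an empty map.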

Examples

Example 1: Basic usage of map_from_entries

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT array(struct(1, 'a'), struct(2, 'b')) as data")
>>> df.select(sf.map_from_entries(df.data)).show()
+----------------------+
|map_from_entries(data)|
+----------------------+
|      {1 -> a, 2 -> b}|
+----------------------+

Example 2: map_from_entries with null values

>>> from pyspark.sql import functions as sf
>>> df = spark.sql("SELECT array(struct(1, null), struct(2, 'b')) as data")
>>> df.select(sf.map_from_entries(df.data)).show()
+----------------------+
|map_from_entries(data)|
+----------------------+
|   {1 -> NULL, 2 -> b}|
+----------------------+

Example 3: map_from_entries with a DataFrame built from Row objects

>>> from pyspark.sql import Row, functions as sf
>>> df = spark.createDataFrame([([Row(1, "a"), Row(2, "b")],), ([Row(3, "c")],)], ['data'])
>>> df.select(sf.map_from_entries(df.data)).show()
+----------------------+
|map_from_entries(data)|
+----------------------+
|      {1 -> a, 2 -> b}|
|              {3 -> c}|
+----------------------+

Example 4: map_from_entries with an empty array

>>> from pyspark.sql import functions as sf
>>> from pyspark.sql.types import ArrayType, StringType, IntegerType, StructType, StructField
>>> schema = StructType([
...   StructField("data", ArrayType(
...     StructType([
...       StructField("key", IntegerType()),
...       StructField("value", StringType())
...     ])
...   ), True)
... ])
>>> df = spark.createDataFrame([([],)], schema=schema)
>>> df.select(sf.map_from_entries(df.data)).show()
+----------------------+
|map_from_entries(data)|
+----------------------+
|                    {}|
+----------------------+