pyspark.sql.functions.from_json

pyspark.sql.functions.from_json(col, schema, options=None)
Parses a column containing a JSON string into a MapType with StringType keys, or a StructType or ArrayType with the specified schema. Returns null in the case of an unparseable string.

New in version 2.1.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str
    a column or column name in JSON format
schema : DataType or str
    a StructType, an ArrayType of StructType, or a Python string literal with a DDL-formatted string to use when parsing the JSON column
options : dict, optional
    options to control parsing; accepts the same options as the JSON data source. See Data Source Option for the version you use.
Returns

Column
    a new column of complex type from the given JSON object.
 
Examples

Example 1: Parsing JSON with a specified schema

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql.types import StructType, StructField, IntegerType
>>> schema = StructType([StructField("a", IntegerType())])
>>> df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
>>> df.select(sf.from_json(df.value, schema).alias("json")).show()
+----+
|json|
+----+
| {1}|
+----+

Example 2: Parsing JSON with a DDL-formatted string

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
>>> df.select(sf.from_json(df.value, "a INT").alias("json")).show()
+----+
|json|
+----+
| {1}|
+----+

Example 3: Parsing JSON into a MapType

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, '''{"a": 1}''')], ("key", "value"))
>>> df.select(sf.from_json(df.value, "MAP<STRING,INT>").alias("json")).show()
+--------+
|    json|
+--------+
|{a -> 1}|
+--------+

Example 4: Parsing JSON into an ArrayType of StructType

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql.types import ArrayType, StructType, StructField, IntegerType
>>> schema = ArrayType(StructType([StructField("a", IntegerType())]))
>>> df = spark.createDataFrame([(1, '''[{"a": 1}]''')], ("key", "value"))
>>> df.select(sf.from_json(df.value, schema).alias("json")).show()
+-----+
| json|
+-----+
|[{1}]|
+-----+

Example 5: Parsing JSON into an ArrayType

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql.types import ArrayType, IntegerType
>>> schema = ArrayType(IntegerType())
>>> df = spark.createDataFrame([(1, '''[1, 2, 3]''')], ("key", "value"))
>>> df.select(sf.from_json(df.value, schema).alias("json")).show()
+---------+
|     json|
+---------+
|[1, 2, 3]|
+---------+

Example 6: Parsing JSON with specified options

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, '''{a:123}'''), (2, '''{"a":456}''')], ("key", "value"))
>>> parsed1 = sf.from_json(df.value, "a INT")
>>> parsed2 = sf.from_json(df.value, "a INT", {"allowUnquotedFieldNames": "true"})
>>> df.select("value", parsed1, parsed2).show()
+---------+----------------+----------------+
|    value|from_json(value)|from_json(value)|
+---------+----------------+----------------+
|  {a:123}|          {NULL}|           {123}|
|{"a":456}|           {456}|           {456}|
+---------+----------------+----------------+