pyspark.sql.functions.schema_of_json

pyspark.sql.functions.schema_of_json(json, options={})

Parses a JSON string and infers its schema in DDL format.

New in version 2.4.0.

Parameters:
json : Column or str

A JSON string, or a foldable string column containing a JSON string.

options : dict, optional

Options to control parsing. Accepts the same options as the JSON datasource.

Changed in version 3.0: Accepts the options parameter to control schema inference.

Examples

>>> from pyspark.sql.functions import lit, schema_of_json
>>> df = spark.range(1)
>>> df.select(schema_of_json(lit('{"a": 0}')).alias("json")).collect()
[Row(json='STRUCT<`a`: BIGINT>')]
>>> schema = schema_of_json('{a: 1}', {'allowUnquotedFieldNames':'true'})
>>> df.select(schema.alias("json")).collect()
[Row(json='STRUCT<`a`: BIGINT>')]