pyspark.sql.functions.get_json_object
pyspark.sql.functions.get_json_object(col, path)
Extracts a JSON object from a JSON string based on the specified JSON path, and returns the extracted JSON object as a JSON string. Returns null if the input JSON string is invalid.

New in version 1.6.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str
    string column in JSON format
path : str
    path to the JSON object to extract
Returns
Column
    string representation of the given JSON object value.
 
Examples

Example 1: Extract a JSON object from a JSON string

>>> data = [("1", '''{"f1": "value1", "f2": "value2"}'''), ("2", '''{"f1": "value12"}''')]
>>> df = spark.createDataFrame(data, ("key", "jstring"))
>>> df.select(df.key,
...     get_json_object(df.jstring, '$.f1').alias("c0"),
...     get_json_object(df.jstring, '$.f2').alias("c1")
... ).show()
+---+-------+------+
|key|     c0|    c1|
+---+-------+------+
|  1| value1|value2|
|  2|value12|  NULL|
+---+-------+------+

Example 2: Extract a JSON object from a JSON array

>>> data = [
...     ("1", '''[{"f1": "value1"},{"f1": "value2"}]'''),
...     ("2", '''[{"f1": "value12"},{"f2": "value13"}]''')
... ]
>>> df = spark.createDataFrame(data, ("key", "jarray"))
>>> df.select(df.key,
...     get_json_object(df.jarray, '$[0].f1').alias("c0"),
...     get_json_object(df.jarray, '$[1].f2').alias("c1")
... ).show()
+---+-------+-------+
|key|     c0|     c1|
+---+-------+-------+
|  1| value1|   NULL|
|  2|value12|value13|
+---+-------+-------+

>>> df.select(df.key,
...     get_json_object(df.jarray, '$[*].f1').alias("c0"),
...     get_json_object(df.jarray, '$[*].f2').alias("c1")
... ).show()
+---+-------------------+---------+
|key|                 c0|       c1|
+---+-------------------+---------+
|  1|["value1","value2"]|     NULL|
|  2|          "value12"|"value13"|
+---+-------------------+---------+
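To make the null-on-invalid-input behavior concrete without a running Spark session, the sketch below is a rough pure-Python analogue of what `get_json_object` does for a simple top-level path such as `$.f1`, using only the standard `json` module. The function name `get_json_object_py` is hypothetical and this is not Spark's implementation; it only mirrors the documented contract (extracted value returned as a string, `None` where Spark would return NULL).

```python
import json

def get_json_object_py(json_str, field):
    # Hypothetical analogue of get_json_object(col, '$.' + field):
    # returns the field's value as a string, or None (Spark: NULL)
    # when the input is not valid JSON or the field is missing.
    try:
        obj = json.loads(json_str)
    except (ValueError, TypeError):
        return None  # invalid JSON string -> NULL, per the description above
    if not isinstance(obj, dict) or field not in obj:
        return None  # path not found -> NULL
    value = obj[field]
    # Non-string values are returned as their JSON string representation.
    return value if isinstance(value, str) else json.dumps(value)

print(get_json_object_py('{"f1": "value1", "f2": "value2"}', "f1"))  # value1
print(get_json_object_py('not valid json', "f1"))                    # None
```

This mirrors the first example's `c0` column: a present field yields its string value, while invalid input or a missing field yields NULL rather than raising an error.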