pyspark.sql.functions.to_json
pyspark.sql.functions.to_json(col, options=None)[source]
Converts a column containing a StructType, ArrayType or a MapType into a JSON string. Throws an exception in the case of an unsupported type.

New in version 2.1.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str
    name of column containing a struct, an array or a map.
options : dict, optional
    options to control converting. Accepts the same options as the JSON datasource. See Data Source Option for the version you use. Additionally the function supports the pretty option, which enables pretty JSON generation (see the sketch after Example 6 below).
 
Returns
Column
    JSON object as string column.
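As a quick illustration (an addition, not part of the upstream reference), the result is an ordinary string column, which can be confirmed with DataFrame.dtypes:

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([(1, Row(age=2, name='Alice'))], ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).dtypes
[('json', 'string')]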
 
Examples

Example 1: Converting a StructType column to JSON

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql import Row
>>> data = [(1, Row(age=2, name='Alice'))]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
+------------------------+
|json                    |
+------------------------+
|{"age":2,"name":"Alice"}|
+------------------------+

Example 2: Converting an ArrayType column to JSON

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql import Row
>>> data = [(1, [Row(age=2, name='Alice'), Row(age=3, name='Bob')])]
>>> df = spark.createDataFrame(data, ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
+-------------------------------------------------+
|json                                             |
+-------------------------------------------------+
|[{"age":2,"name":"Alice"},{"age":3,"name":"Bob"}]|
+-------------------------------------------------+

Example 3: Converting a MapType column to JSON

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, {"name": "Alice"})], ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
+----------------+
|json            |
+----------------+
|{"name":"Alice"}|
+----------------+

Example 4: Converting a nested MapType column to JSON

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, [{"name": "Alice"}, {"name": "Bob"}])], ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
+---------------------------------+
|json                             |
+---------------------------------+
|[{"name":"Alice"},{"name":"Bob"}]|
+---------------------------------+

Example 5: Converting a simple ArrayType column to JSON

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1, ["Alice", "Bob"])], ("key", "value"))
>>> df.select(sf.to_json(df.value).alias("json")).show(truncate=False)
+---------------+
|json           |
+---------------+
|["Alice","Bob"]|
+---------------+

Example 6: Converting to JSON with specified options

>>> import pyspark.sql.functions as sf
>>> df = spark.sql("SELECT (DATE('2022-02-22'), 1) AS date")
>>> json1 = sf.to_json(df.date)
>>> json2 = sf.to_json(df.date, {"dateFormat": "yyyy/MM/dd"})
>>> df.select("date", json1, json2).show(truncate=False)
+---------------+------------------------------+------------------------------+
|date           |to_json(date)                 |to_json(date)                 |
+---------------+------------------------------+------------------------------+
|{2022-02-22, 1}|{"col1":"2022-02-22","col2":1}|{"col1":"2022/02/22","col2":1}|
+---------------+------------------------------+------------------------------+
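The pretty option mentioned under Parameters is not covered by the official examples above. The following sketch (an addition to the upstream examples) shows how it can be passed; the exact indentation and spacing of the rendered output depend on the underlying JSON generator, so no expected doctest output is asserted here:

>>> import pyspark.sql.functions as sf
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([(1, Row(age=2, name='Alice'))], ("key", "value"))
>>> # each row's JSON is rendered across multiple indented lines instead of the
>>> # compact single-line form shown in Example 1
>>> df.select(sf.to_json(df.value, {"pretty": "true"}).alias("json")).show(truncate=False)  # doctest: +SKIP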