pyspark.sql.DataFrameReader.format

DataFrameReader.format(source)

Specifies the input data source format.

New in version 1.4.0.

Parameters
source : str

Name of the data source, e.g. ‘json’, ‘parquet’.

Examples

>>> df = spark.read.format('json').load('python/test_support/sql/people.json')
>>> df.dtypes
[('age', 'bigint'), ('name', 'string')]