pyspark.pandas.DataFrame.to_dict
- DataFrame.to_dict(orient='dict', into=<class 'dict'>)
- Convert the DataFrame to a dictionary.
- The type of the key-value pairs can be customized with the parameters (see below).
- Note
- This method should only be used if the resulting pandas DataFrame is expected to be small, as all the data is loaded into the driver’s memory.
- Parameters
- orient : str {‘dict’, ‘list’, ‘series’, ‘split’, ‘records’, ‘index’}
- Determines the type of the values of the dictionary.
- ‘dict’ (default) : dict like {column -> {index -> value}}
- ‘list’ : dict like {column -> [values]} 
- ‘series’ : dict like {column -> Series(values)} 
- ‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]} 
- ‘records’ : list like [{column -> value}, … , {column -> value}] 
- ‘index’ : dict like {index -> {column -> value}} 
 - Abbreviations are allowed: ‘s’ indicates ‘series’ and ‘sp’ indicates ‘split’.
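To make the shapes listed above concrete, the following is a minimal pure-Python sketch that lays out one small table (the same values used in the Examples section) in each orientation. The helper `sketch_to_dict` and its arguments are hypothetical names introduced only for illustration; this is not how pyspark.pandas implements `to_dict`, and the ‘series’ orientation is left out because it needs actual Series objects.

```python
# Hypothetical helper for illustration only; NOT the pyspark.pandas implementation.
# It builds each documented shape from plain Python lists.
def sketch_to_dict(index, columns, rows, orient="dict"):
    """Lay out `rows` (a list of row lists) according to `orient`."""
    if orient == "dict":
        # {column -> {index -> value}}
        return {c: {i: r[k] for i, r in zip(index, rows)}
                for k, c in enumerate(columns)}
    if orient == "list":
        # {column -> [values]}
        return {c: [r[k] for r in rows] for k, c in enumerate(columns)}
    if orient == "split":
        # {'index' -> [index], 'columns' -> [columns], 'data' -> [values]}
        return {"index": list(index),
                "columns": list(columns),
                "data": [list(r) for r in rows]}
    if orient == "records":
        # [{column -> value}, ..., {column -> value}]
        return [dict(zip(columns, r)) for r in rows]
    if orient == "index":
        # {index -> {column -> value}}
        return {i: dict(zip(columns, r)) for i, r in zip(index, rows)}
    raise ValueError(f"orient {orient!r} is not covered by this sketch")


index = ["row1", "row2"]
columns = ["col1", "col2"]
rows = [[1, 0.5], [2, 0.75]]

for orient in ("dict", "list", "split", "records", "index"):
    print(orient, "->", sketch_to_dict(index, columns, rows, orient))
```

Note that ‘records’ is the only orientation here that returns a list rather than a mapping, and that it drops the index labels entirely.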
- into : class, default dict
- The collections.abc.Mapping subclass used for all Mappings in the return value. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized. 
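The rule about `collections.defaultdict` follows from how a mapping factory can be derived from `into`: a bare `defaultdict` class carries no `default_factory`, while an initialized instance does. The sketch below shows one plausible way to normalize `into` into a zero-argument factory. The helper name `resolve_mapping_factory` is hypothetical, and the logic is an assumption about the general approach, not the library's actual code.

```python
from collections import OrderedDict, defaultdict
from collections.abc import Mapping
from functools import partial
import inspect


# Hypothetical helper for illustration only; not taken from pyspark.pandas.
def resolve_mapping_factory(into):
    """Turn `into` (a Mapping class or an empty instance) into a factory."""
    if not inspect.isclass(into):
        if isinstance(into, defaultdict):
            # An initialized defaultdict carries its default_factory with it.
            return partial(defaultdict, into.default_factory)
        into = type(into)
    if not issubclass(into, Mapping):
        raise TypeError(f"unsupported type: {into}")
    if into is defaultdict:
        # The bare class has no default_factory to copy.
        raise TypeError("pass an initialized defaultdict, e.g. defaultdict(list)")
    return into


make_od = resolve_mapping_factory(OrderedDict)
make_dd = resolve_mapping_factory(defaultdict(list))

od = make_od([("col1", 1), ("col2", 0.5)])
dd = make_dd([("col1", 1), ("col2", 0.5)])
dd["missing"].append("x")  # the list default_factory was preserved
```

Under these assumptions, passing the class `defaultdict` itself raises a `TypeError`, whereas `defaultdict(list)` yields mappings that keep the `list` default.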
 
- Returns
- dict, list or collections.abc.Mapping
- Return a collections.abc.Mapping object representing the DataFrame. The resulting transformation depends on the orient parameter. 
 
- Examples

>>> df = ps.DataFrame({'col1': [1, 2],
...                    'col2': [0.5, 0.75]},
...                   index=['row1', 'row2'],
...                   columns=['col1', 'col2'])
>>> df
      col1  col2
row1     1  0.50
row2     2  0.75

>>> df_dict = df.to_dict()
>>> sorted([(key, sorted(values.items())) for key, values in df_dict.items()])
[('col1', [('row1', 1), ('row2', 2)]), ('col2', [('row1', 0.5), ('row2', 0.75)])]

- You can specify the return orientation.

>>> df_dict = df.to_dict('series')
>>> sorted(df_dict.items())
[('col1', row1    1
row2    2
Name: col1, dtype: int64), ('col2', row1    0.50
row2    0.75
Name: col2, dtype: float64)]

>>> df_dict = df.to_dict('split')
>>> sorted(df_dict.items())
[('columns', ['col1', 'col2']), ('data', [[1..., 0.75]]), ('index', ['row1', 'row2'])]

>>> df_dict = df.to_dict('records')
>>> [sorted(values.items()) for values in df_dict]
[[('col1', 1...), ('col2', 0.5)], [('col1', 2...), ('col2', 0.75)]]

>>> df_dict = df.to_dict('index')
>>> sorted([(key, sorted(values.items())) for key, values in df_dict.items()])
[('row1', [('col1', 1), ('col2', 0.5)]), ('row2', [('col1', 2), ('col2', 0.75)])]

- You can also specify the mapping type.

>>> from collections import OrderedDict, defaultdict
>>> df.to_dict(into=OrderedDict)
OrderedDict(...)

- If you want a defaultdict, you need to initialize it:

>>> dd = defaultdict(list)
>>> df.to_dict('records', into=dd)
[defaultdict(<class 'list'>, {'col..., 'col...}), defaultdict(<class 'list'>, {'col..., 'col...})]