pyspark.pandas.DataFrame.from_records

static DataFrame.from_records(data: Union[numpy.ndarray, List[tuple], dict, pandas.core.frame.DataFrame], index: Union[str, list, numpy.ndarray] = None, exclude: list = None, columns: list = None, coerce_float: bool = False, nrows: int = None) → pyspark.pandas.frame.DataFrame

Convert structured or record ndarray to DataFrame.

Parameters

data : ndarray (structured dtype), list of tuples, dict, or DataFrame
index : string, list of fields, array-like
    Field of array to use as the index, alternately a specific set of input labels to use
exclude : sequence, default None
    Columns or fields to exclude
columns : sequence, default None
    Column names to use. If the passed data do not have names associated with them, this argument provides names for the columns. Otherwise this argument indicates the order of the columns in the result (any names not found in the data will become all-NA columns)
coerce_float : boolean, default False
    Attempt to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets
nrows : int, default None
    Number of rows to read if data is an iterator
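
The interaction of columns, index, and exclude is easier to see on a small input. The sketch below uses plain pandas rather than pandas-on-Spark so that it runs without a Spark session; it assumes that ps.DataFrame.from_records follows the same parameter semantics as pandas.DataFrame.from_records, which its signature mirrors. The sample records and column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: three unnamed records.
records = [(1, "a", 0.5), (2, "b", 1.5), (3, "c", 2.5)]
names = ["id", "label", "score"]

# columns supplies names because tuples carry none.
named = pd.DataFrame.from_records(records, columns=names)

# index picks one field as the row index; it is removed from the columns.
indexed = pd.DataFrame.from_records(records, columns=names, index="id")

# exclude drops fields from the result.
trimmed = pd.DataFrame.from_records(records, columns=names, exclude=["score"])

print(list(named.columns))
print(indexed.index.name, list(indexed.columns))
print(list(trimmed.columns))
```

With a running Spark session, replacing pd with ps (import pyspark.pandas as ps) should give the pandas-on-Spark equivalent, under the assumption stated above.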

Returns

df : DataFrame

Examples

Use dict as input

>>> ps.DataFrame.from_records({'A': [1, 2, 3]})
   A
0  1
1  2
2  3

Use list of tuples as input

>>> ps.DataFrame.from_records([(1, 2), (3, 4)])
   0  1
0  1  2
1  3  4

Use NumPy array as input

>>> ps.DataFrame.from_records(np.eye(3))
     0    1    2
0  1.0  0.0  0.0
1  0.0  1.0  0.0
2  0.0  0.0  1.0
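
The summary mentions structured ndarrays, which the examples above do not cover. A minimal sketch of that case follows, again written against plain pandas on the assumption that the pandas-on-Spark method treats the input the same way; the field names id and label are hypothetical.

```python
import numpy as np
import pandas as pd

# A structured array: each field has its own name and dtype.
structured = np.array(
    [(1, "a"), (2, "b")],
    dtype=[("id", "i8"), ("label", "U1")],
)

# Field names become column names; no columns argument is needed.
df = pd.DataFrame.from_records(structured)

# A field can also be promoted to the index by name.
by_id = pd.DataFrame.from_records(structured, index="id")

print(list(df.columns))
print(list(by_id.index), list(by_id.columns))
```

Because the array already carries field names, passing columns here would reorder or select fields rather than name them.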
