pyspark.pandas.DataFrame.astype

DataFrame.astype(dtype: Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype, Dict[Union[Any, Tuple[Any, …]], Union[str, numpy.dtype, pandas.core.dtypes.base.ExtensionDtype]]]) → pyspark.pandas.frame.DataFrame

Cast a pandas-on-Spark object to the specified dtype.

Parameters
dtype : data type, or dict of column name -> data type

Use a numpy.dtype or Python type to cast the entire pandas-on-Spark object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type, to cast one or more of the DataFrame’s columns to column-specific types.

Returns
casted : same type as caller

See also

to_datetime

Convert argument to datetime.

Examples

>>> df = ps.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]}, dtype='int64')
>>> df
   a  b
0  1  1
1  2  2
2  3  3

Convert to float type:

>>> df.astype('float')
     a    b
0  1.0  1.0
1  2.0  2.0
2  3.0  3.0

Convert back to int64 type:

>>> df.astype('int64')
   a  b
0  1  1
1  2  2
2  3  3

Convert column a to float type:

>>> df.astype({'a': float})
     a  b
0  1.0  1
1  2.0  2
2  3.0  3
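
Cast several columns at once and inspect the resulting dtypes:

The following is a minimal sketch rather than part of the official examples. It passes a dict that maps each column label to its own target type, then compares the dtypes of the original and the returned object to show that astype returns a new DataFrame and leaves the caller unchanged. The fallback import of plain pandas is an assumption of this sketch, added only so it can run where pyspark.pandas is unavailable; the variable names are illustrative.

```python
# Sketch only: fall back to plain pandas when pyspark.pandas cannot be
# imported, so the example still runs without a Spark installation.
try:
    import pyspark.pandas as ps
except Exception:
    import pandas as ps

df = ps.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3]}, dtype="int64")

# Cast each column to its own type with a single dict argument.
casted = df.astype({"a": "float64", "b": "int32"})

# astype returns a new object; the original keeps its dtypes.
original_dtypes = {col: str(dt) for col, dt in df.dtypes.items()}
casted_dtypes = {col: str(dt) for col, dt in casted.dtypes.items()}

print(original_dtypes)  # {'a': 'int64', 'b': 'int64'}
print(casted_dtypes)    # {'a': 'float64', 'b': 'int32'}
```

Columns that are not listed in the dict keep their existing dtype, as in the single-column example above.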