pyspark.pandas.to_numeric

pyspark.pandas.to_numeric(arg)

Convert argument to a numeric type.

Parameters
arg : scalar, list, tuple, 1-d array, or Series
Returns
ret : numeric if parsing succeeded.

See also

DataFrame.astype

Cast argument to a specified dtype.

to_datetime

Convert argument to datetime.

to_timedelta

Convert argument to timedelta.

numpy.ndarray.astype

Cast a numpy array to a specified type.

Examples

>>> psser = ps.Series(['1.0', '2', '-3'])
>>> psser
0    1.0
1      2
2     -3
dtype: object
>>> ps.to_numeric(psser)
0    1.0
1    2.0
2   -3.0
dtype: float32
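
The float32 dtype in the output above means the result is single precision, so some values cannot be stored exactly. The following is a minimal standard-library sketch of that rounding. It does not call PySpark, and `round_to_float32` is a hypothetical helper introduced only for illustration.

```python
import struct


def round_to_float32(x):
    """Hypothetical helper: round a Python float to the nearest float32 value."""
    # Pack as a 4-byte IEEE 754 single, then unpack back to a Python float.
    return struct.unpack('f', struct.pack('f', x))[0]


print(round_to_float32(1.0))         # 1.0
print(round_to_float32(16777217.0))  # 16777216.0 (2**24 + 1 is not representable)
print(round_to_float32(0.1) == 0.1)  # False
```

Small integers and simple fractions such as those in the example survive unchanged. Large integers and most decimal fractions are rounded.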

If the given Series contains a value that cannot be cast to float, that value is converted to np.nan.

>>> psser = ps.Series(['apple', '1.0', '2', '-3'])
>>> psser
0    apple
1      1.0
2        2
3       -3
dtype: object
>>> ps.to_numeric(psser)
0    NaN
1    1.0
2    2.0
3   -3.0
dtype: float32
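
The coercion rule shown above can be pictured with a small pure-Python sketch. It only illustrates the observable behavior and is not how PySpark implements the cast. `to_float_or_nan` is a hypothetical helper, and Python's `float()` may accept or reject some strings differently from Spark.

```python
import math


def to_float_or_nan(value):
    """Hypothetical helper: parse one value as float, or return NaN on failure."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # Unparseable input becomes NaN instead of raising.
        return math.nan


values = ['apple', '1.0', '2', '-3']
converted = [to_float_or_nan(v) for v in values]
print(converted)  # [nan, 1.0, 2.0, -3.0]
```

Because NaN compares unequal to itself, use `math.isnan` (or `Series.isna` on a Series) to find the entries that failed to parse.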

A list, tuple, np.array, or scalar is also supported.

>>> ps.to_numeric(['1.0', '2', '-3'])
array([ 1.,  2., -3.])
>>> ps.to_numeric(('1.0', '2', '-3'))
array([ 1.,  2., -3.])
>>> ps.to_numeric(np.array(['1.0', '2', '-3']))
array([ 1.,  2., -3.])
>>> ps.to_numeric('1.0')
1.0