pyspark.pandas.Series.where

Series.where(cond: pyspark.pandas.series.Series, other: Any = nan) → pyspark.pandas.series.Series

Replace values where the condition is False.

Parameters
cond : boolean Series

Where cond is True, keep the original value. Where cond is False, replace it with the corresponding value from other.

other : scalar, Series

Entries where cond is False are replaced with the corresponding value from other. Defaults to NaN.

Returns
Series

Examples

>>> import pyspark.pandas as ps
>>> from pyspark.pandas.config import set_option, reset_option
>>> set_option("compute.ops_on_diff_frames", True)
>>> s1 = ps.Series([0, 1, 2, 3, 4])
>>> s2 = ps.Series([100, 200, 300, 400, 500])
>>> s1.where(s1 > 0).sort_index()
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
dtype: float64
>>> s1.where(s1 > 1, 10).sort_index()
0    10
1    10
2     2
3     3
4     4
dtype: int64
>>> s1.where(s1 > 1, s1 + 100).sort_index()
0    100
1    101
2      2
3      3
4      4
dtype: int64
>>> s1.where(s1 > 1, s2).sort_index()
0    100
1    200
2      2
3      3
4      4
dtype: int64
>>> reset_option("compute.ops_on_diff_frames")
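
The element-wise rule described under Parameters can be sketched in plain Python, without a Spark session. This is only an illustration of the semantics, not how pyspark.pandas implements it: `where_sketch` is a hypothetical helper, and it pairs values by position, whereas a real Series is matched on its index.

```python
import math


def where_sketch(values, cond, other=math.nan):
    """Hypothetical helper, not part of pyspark.pandas.

    Keeps values[i] where cond[i] is true, otherwise takes the
    replacement from `other` (a scalar or a same-length sequence).
    """
    # Assumption: a sequence `other` lines up with `values` by position.
    if isinstance(other, (list, tuple)):
        replacements = list(other)
    else:
        # Broadcast a scalar replacement to every position.
        replacements = [other] * len(values)
    return [v if c else r for v, c, r in zip(values, cond, replacements)]


s1 = [0, 1, 2, 3, 4]
s2 = [100, 200, 300, 400, 500]

# Scalar replacement, analogous to s1.where(s1 > 1, 10)
scalar_result = where_sketch(s1, [v > 1 for v in s1], 10)

# Sequence replacement, analogous to s1.where(s1 > 1, s2)
series_result = where_sketch(s1, [v > 1 for v in s1], s2)

# Default replacement is NaN, analogous to s1.where(s1 > 0)
default_result = where_sketch(s1, [v > 0 for v in s1])
```

The default case also suggests why the first example above reports dtype float64: NaN is a floating-point value, so an integer Series that receives it can no longer stay int64.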