pyspark.pandas.DataFrame.replace
- DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
Returns a new DataFrame replacing a value with another value.
- Parameters
- to_replace : int, float, string, list, tuple or dict
Value to be replaced.
- value : int, float, string, list or tuple
Value to use in place of the values matched by to_replace. The replacement value must be an int, float, or string. If value is a list or tuple, it should be the same length as to_replace.
- inplace : boolean, default False
If True, replace in place (do not create a new object).
- limit : int, default None
Maximum size gap to forward or backward fill.
Deprecated since version 4.0.0.
- regex : bool or str, default False
Whether to interpret to_replace and/or value as regular expressions. If this is True then to_replace must be a string. Alternatively, this could be a regular expression in which case to_replace must be None.
- method : ‘pad’, default None
The method to use for replacement when to_replace is a scalar, list or tuple and value is None.
Deprecated since version 4.0.0.
- Returns
- DataFrame
Object after replacement.
Examples
>>> df = ps.DataFrame({"name": ['Ironman', 'Captain America', 'Thor', 'Hulk'],
...                    "weapon": ['Mark-45', 'Shield', 'Mjolnir', 'Smash']},
...                   columns=['name', 'weapon'])
>>> df
              name   weapon
0          Ironman  Mark-45
1  Captain America   Shield
2             Thor  Mjolnir
3             Hulk    Smash
Scalar to_replace and value
>>> df.replace('Ironman', 'War-Machine')
              name   weapon
0      War-Machine  Mark-45
1  Captain America   Shield
2             Thor  Mjolnir
3             Hulk    Smash
List-like to_replace and value
>>> df.replace(['Ironman', 'Captain America'], ['Rescue', 'Hawkeye'], inplace=True)
>>> df
      name   weapon
0   Rescue  Mark-45
1  Hawkeye   Shield
2     Thor  Mjolnir
3     Hulk    Smash
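The positional pairing of a list-like to_replace with a list-like value can be modeled in plain Python. This is only a sketch of the rule described under the value parameter, not the library's implementation: pair_replacements is a hypothetical helper, and raising ValueError on a length mismatch is an assumption made for the illustration.

```python
def pair_replacements(to_replace, value):
    """Hypothetical helper: build an old -> new lookup from two list-likes."""
    # Assumption for this sketch: a length mismatch is treated as an error.
    if len(to_replace) != len(value):
        raise ValueError("to_replace and value must have the same length")
    # The i-th entry of to_replace is replaced by the i-th entry of value.
    return dict(zip(to_replace, value))


names = ['Ironman', 'Captain America', 'Thor', 'Hulk']
lookup = pair_replacements(['Ironman', 'Captain America'], ['Rescue', 'Hawkeye'])

# Values without an entry in the lookup are left unchanged.
replaced = [lookup.get(name, name) for name in names]
print(replaced)  # ['Rescue', 'Hawkeye', 'Thor', 'Hulk']
```

The result matches the name column shown above: each entry of to_replace is swapped for the entry at the same position in value.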
Dicts can be used to specify different replacement values for different existing values. To use a dict in this way, the value parameter should be None.
>>> df.replace({'Mjolnir': 'Stormbuster'})
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash
A dict can specify that different values should be replaced in different columns. The value parameter should not be None in this case.
>>> df.replace({'weapon': 'Mjolnir'}, 'Stormbuster')
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash
Nested dictionaries map a column to its own dict of existing values and replacements. The value parameter should be None to use a nested dict in this way.
>>> df.replace({'weapon': {'Mjolnir': 'Stormbuster'}})
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash
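The three dict forms shown in these examples differ in how the keys are read. The plain-Python sketch below models that difference on a single cell. It is an illustration under stated assumptions rather than the library's code: replace_cell and replace_rows are hypothetical helpers, rows are represented as ordinary dicts, and a dict whose values are all dicts is assumed to be the nested form.

```python
_MISSING = object()  # sentinel for "column not mentioned in to_replace"


def replace_cell(column, cell, to_replace, value=None):
    """Hypothetical model of the dict forms of to_replace for one cell."""
    if value is not None:
        # {column: old} with a separate value: replace only in that column.
        return value if to_replace.get(column, _MISSING) == cell else cell
    if to_replace and all(isinstance(v, dict) for v in to_replace.values()):
        # {column: {old: new}}: each column has its own lookup table.
        return to_replace.get(column, {}).get(cell, cell)
    # {old: new}: the same lookup applies to every column.
    return to_replace.get(cell, cell)


def replace_rows(rows, to_replace, value=None):
    """Apply replace_cell to every cell of every row."""
    return [
        {col: replace_cell(col, cell, to_replace, value) for col, cell in row.items()}
        for row in rows
    ]


rows = [
    {'name': 'Rescue', 'weapon': 'Mark-45'},
    {'name': 'Thor', 'weapon': 'Mjolnir'},
]

flat = replace_rows(rows, {'Mjolnir': 'Stormbuster'})
per_column = replace_rows(rows, {'weapon': 'Mjolnir'}, 'Stormbuster')
nested = replace_rows(rows, {'weapon': {'Mjolnir': 'Stormbuster'}})

print(nested[1])                    # {'name': 'Thor', 'weapon': 'Stormbuster'}
print(flat == per_column == nested)  # True
```

On this data all three forms give the same result, as in the examples above. They diverge only when the value to replace also occurs in another column: the flat form replaces it everywhere, while the other two forms touch only the named column.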