pyspark.pandas.DataFrame.rename

DataFrame.rename(mapper=None, index=None, columns=None, axis='index', inplace=False, level=None, errors='ignore')

Alter axes labels. Function / dict values must be unique (1-to-1). Labels not contained in a dict / Series will be left as-is. Extra labels listed don’t throw an error.

Parameters

mapper (dict-like or function): Dict-like or function transformation to apply to that axis’ values. Use either mapper and axis to specify the axis to target with mapper, or index and columns.
index (dict-like or function): Alternative to specifying axis (“mapper, axis=0” is equivalent to “index=mapper”).
columns (dict-like or function): Alternative to specifying axis (“mapper, axis=1” is equivalent to “columns=mapper”).
axis (int or str, default ‘index’): Axis to target with mapper. Can be either the axis name (‘index’, ‘columns’) or number (0, 1).
inplace (bool, default False): If False, return a new DataFrame. If True, modify this DataFrame in place.
level (int or level name, default None): In case of a MultiIndex, only rename labels in the specified level.
errors ({‘ignore’, ‘raise’}, default ‘ignore’): If ‘raise’, raise a KeyError when a dict-like mapper, index, or columns contains labels that are not present in the Index being transformed. If ‘ignore’, existing keys will be renamed, and extra keys will be ignored.
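
The mapper and errors rules described above can be illustrated with a minimal plain-Python sketch. It is not the pandas-on-Spark implementation: `rename_labels` is a hypothetical helper operating on an ordinary list of labels, and the wording of its error message is an assumption.

```python
from typing import Callable, Hashable, List, Mapping, Union

Mapper = Union[Mapping[Hashable, Hashable], Callable[[Hashable], Hashable]]


def rename_labels(labels: List[Hashable], mapper: Mapper, errors: str = "ignore") -> List[Hashable]:
    """Hypothetical helper mimicking the documented label-renaming rules."""
    if callable(mapper):
        # A function mapper is applied to every label.
        return [mapper(label) for label in labels]
    if errors == "raise":
        # With errors='raise', every key of a dict-like mapper must be an existing label.
        missing = [key for key in mapper if key not in labels]
        if missing:
            raise KeyError(f"{missing} not found in axis")
    # Labels absent from the mapper are left as-is; extra keys are ignored.
    return [mapper.get(label, label) for label in labels]
```

Under these assumptions, `rename_labels(["A", "B"], {"A": "a", "C": "c"})` leaves "B" untouched and ignores the extra key "C", while the same call with `errors="raise"` raises a KeyError.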

Returns

DataFrame with the renamed axis labels.

Raises

KeyError: If errors='raise' and any label in a dict-like mapper is not found in the selected axis.

Examples

>>> import pandas as pd
>>> import pyspark.pandas as ps

>>> psdf1 = ps.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
>>> psdf1.rename(columns={"A": "a", "B": "c"})  
   a  c
0  1  4
1  2  5
2  3  6

>>> psdf1.rename(index={1: 10, 2: 20})  
    A  B
0   1  4
10  2  5
20  3  6

>>> psdf1.rename(columns={"A": "a", "C": "c"}, errors="raise")
Traceback (most recent call last):
    ...
KeyError: 'Index include value which is not in the `mapper`'

>>> def str_lower(s) -> str:
...     return str.lower(s)
>>> psdf1.rename(str_lower, axis='columns')  
   a  b
0  1  4
1  2  5
2  3  6

>>> def mul10(x) -> int:
...     return x * 10
>>> psdf1.rename(mul10, axis='index')  
    A  B
0   1  4
10  2  5
20  3  6

>>> idx = pd.MultiIndex.from_tuples([('X', 'A'), ('X', 'B'), ('Y', 'C'), ('Y', 'D')])
>>> psdf2 = ps.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=idx)
>>> psdf2.rename(columns=str_lower, level=0)  
   x     y
   A  B  C  D
0  1  2  3  4
1  5  6  7  8
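
The effect of level on MultiIndex labels can be sketched in plain Python by treating each label as a tuple and transforming only one position. This is an illustration under stated assumptions, not the library's code: `rename_level` is a hypothetical helper, and it accepts only an integer level, not a level name.

```python
from typing import Callable, Hashable, List, Mapping, Tuple, Union

Mapper = Union[Mapping[Hashable, Hashable], Callable[[Hashable], Hashable]]


def rename_level(labels: List[Tuple[Hashable, ...]], mapper: Mapper, level: int) -> List[Tuple[Hashable, ...]]:
    """Hypothetical helper: rename only position `level` of each tuple label."""
    if callable(mapper):
        transform = mapper
    else:
        # Dict-like mapper: entries not listed in the mapper stay unchanged.
        def transform(value):
            return mapper.get(value, value)

    return [
        tuple(transform(part) if i == level else part for i, part in enumerate(label))
        for label in labels
    ]


# Same column labels as the MultiIndex example above.
columns = [("X", "A"), ("X", "B"), ("Y", "C"), ("Y", "D")]
renamed = rename_level(columns, str.lower, level=0)
```

Here `renamed` lowercases only the outer level ('X' and 'Y'), which mirrors the `level=0` output shown above.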

>>> psdf3 = ps.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=idx, columns=list('ab'))
>>> psdf3.rename(index=str_lower)  
     a  b
x a  1  2
  b  3  4
y c  5  6
  d  7  8