pyspark.pandas.CategoricalIndex.rename_categories

CategoricalIndex.rename_categories(new_categories: Union[list, dict, Callable], inplace: bool = False) → Optional[pyspark.pandas.indexes.category.CategoricalIndex]

Rename categories.

Parameters
new_categories : list-like, dict-like or callable

New categories which will replace old categories.

  • list-like: all items must be unique and the number of items in the new categories must match the existing number of categories.

  • dict-like: specifies a mapping from old categories to new. Categories not contained in the mapping are passed through and extra categories in the mapping are ignored.

  • callable: a callable that is called on all items in the old categories and whose return values comprise the new categories.
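
The three accepted forms can be modelled in plain Python. The sketch below is illustrative only: `rename_categories_sketch` is a hypothetical helper, not part of pyspark.pandas, and it operates on a plain list of category labels rather than on a CategoricalIndex. It assumes only the rules stated above.

```python
from collections.abc import Mapping


def rename_categories_sketch(old_categories, new_categories):
    """Hypothetical model of the renaming rules; not the library's code."""
    old = list(old_categories)

    if isinstance(new_categories, Mapping):
        # dict-like: unmapped categories pass through unchanged;
        # keys that match no existing category are simply never looked up.
        return [new_categories.get(cat, cat) for cat in old]

    if callable(new_categories):
        # callable: applied to every old category in order.
        return [new_categories(cat) for cat in old]

    # list-like: must be unique and match the existing number of categories.
    renamed = list(new_categories)
    if len(renamed) != len(old):
        raise ValueError("number of new categories must match the existing number")
    if len(set(renamed)) != len(renamed):
        raise ValueError("new categories must be unique")
    return renamed


print(rename_categories_sketch(["a", "b"], [0, 1]))
print(rename_categories_sketch(["a", "b"], {"a": "A", "c": "C"}))
print(rename_categories_sketch(["a", "b"], lambda x: x.upper()))
```

Only the category labels are modelled here; in the real method the index values follow their categories, so every element that belonged to an old category takes its new label.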

inplace : bool, default False

Whether to rename the categories in place or return a copy of this categorical with the renamed categories.

Deprecated since version 3.2.0.

Returns
cat : CategoricalIndex or None

Categorical with renamed categories or None if inplace=True.

Raises
ValueError

If new categories are list-like and do not have the same number of items as the current categories, or do not validate as categories.

See also

reorder_categories

Reorder categories.

add_categories

Add new categories.

remove_categories

Remove the specified categories.

remove_unused_categories

Remove categories which are not used.

set_categories

Set the categories to the specified ones.

Examples

>>> import pyspark.pandas as ps
>>> idx = ps.CategoricalIndex(["a", "a", "b"])
>>> idx.rename_categories([0, 1])
CategoricalIndex([0, 0, 1], categories=[0, 1], ordered=False, dtype='category')

For dict-like new_categories, extra keys are ignored and categories not in the dictionary are passed through:

>>> idx.rename_categories({'a': 'A', 'c': 'C'})
CategoricalIndex(['A', 'A', 'b'], categories=['A', 'b'], ordered=False, dtype='category')

You may also provide a callable to create the new categories:

>>> idx.rename_categories(lambda x: x.upper())
CategoricalIndex(['A', 'A', 'B'], categories=['A', 'B'], ordered=False, dtype='category')