Index objects#

Index#

Index([data, dtype, copy, name, tupleize_cols])

pandas-on-Spark Index that corresponds to pandas Index logically.

Properties#

Index.is_monotonic_increasing

Return boolean if values in the object are monotonically increasing.

Index.is_monotonic_decreasing

Return boolean if values in the object are monotonically decreasing.
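
The two monotonicity properties can be modelled in plain Python. This is a minimal sketch, not the pandas-on-Spark implementation; it assumes the non-strict pandas convention in which equal neighbours still count as monotonic, and the helper names are hypothetical.

```python
from typing import Sequence


def is_monotonic_increasing(values: Sequence) -> bool:
    """True when every element is >= the one before it (ties allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_monotonic_decreasing(values: Sequence) -> bool:
    """True when every element is <= the one before it (ties allowed)."""
    return all(a >= b for a, b in zip(values, values[1:]))


print(is_monotonic_increasing([1, 2, 2, 5]))  # True
print(is_monotonic_decreasing([1, 2, 2, 5]))  # False
```

Under this convention an empty or single-element sequence is both increasing and decreasing, because there are no neighbouring pairs to violate either condition.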

Index.is_unique

Return if the index has unique values.

Index.has_duplicates

If index has duplicates, return True, otherwise False.

Index.hasnans

Return True if it has any missing values.

Index.dtype

Return the dtype object of the underlying data.

Index.inferred_type

Return a string of the type inferred from the values.

Index.shape

Return a tuple of the shape of the underlying data.

Index.name

Return name of the Index.

Index.names

Return names of the Index.

Index.ndim

Return an int representing the number of array dimensions.

Index.size

Return an int representing the number of elements in this object.

Index.nlevels

Number of levels in Index & MultiIndex.

Index.empty

Return True if the current object is empty.

Index.T

Return the transpose. For an Index, this is the Index itself.

Index.values

Return an array representing the data in the Index.

Modifying and computations#

Index.all([axis, skipna])

Return whether all elements are True.

Index.any([axis])

Return whether any element is True.

Index.argmin()

Return a minimum argument indexer.

Index.argmax()

Return a maximum argument indexer.

Index.copy([name, deep])

Make a copy of this object.

Index.delete(loc)

Make new Index with passed location(-s) deleted.

Index.equals(other)

Determine if two Index objects contain the same elements.

Index.factorize([sort, use_na_sentinel])

Encode the object as an enumerated type or categorical variable.
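
To illustrate what "encode as an enumerated type" means, here is a plain-Python sketch of factorization. It is not the library's code: the `factorize` function, its `sort` flag and the `na_sentinel` value of -1 are assumptions of this sketch, so check the library's own defaults before relying on them.

```python
def factorize(values, sort=False, na_sentinel=-1):
    """Return (codes, uniques): each value is replaced by its position in uniques."""
    uniques = []
    for v in values:
        # Missing values (modelled as None here) never become a category.
        if v is not None and v not in uniques:
            uniques.append(v)
    if sort:
        uniques = sorted(uniques)
    position = {v: i for i, v in enumerate(uniques)}
    codes = [na_sentinel if v is None else position[v] for v in values]
    return codes, uniques


codes, uniques = factorize(["b", None, "a", "b"])
print(codes, uniques)  # [0, -1, 1, 0] ['b', 'a']

codes, uniques = factorize(["b", None, "a", "b"], sort=True)
print(codes, uniques)  # [1, -1, 0, 1] ['a', 'b']
```

The original values can be recovered as `uniques[code]` for every code other than the sentinel.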

Index.identical(other)

Similar to equals, but check that other comparable attributes are also equal.

Index.insert(loc, item)

Make new Index inserting new item at location.

Index.is_boolean()

Return if the current index type is a boolean type.

Index.is_categorical()

Return if the current index type is a categorical type.

Index.is_floating()

Return if the current index type is a floating type.

Index.is_integer()

Return if the current index type is an integer type.

Index.is_interval()

Return if the current index type is an interval type.

Index.is_numeric()

Return if the current index type is a numeric type.

Index.is_object()

Return if the current index type is an object type.

Index.drop(labels)

Make new Index with passed list of labels deleted.

Index.drop_duplicates([keep])

Return Index with duplicate values removed.

Index.min()

Return the minimum value of the Index.

Index.max()

Return the maximum value of the Index.

Index.map(mapper[, na_action])

Map values using input correspondence (a dict, Series, or function).

Index.rename(name[, inplace])

Alter Index or MultiIndex name.

Index.repeat(repeats)

Repeat elements of an Index/MultiIndex.

Index.take(indices)

Return the elements in the given positional indices along an axis.

Index.unique([level])

Return unique values in the index.

Index.nunique([dropna, approx, rsd])

Return number of unique elements in the object.

Index.value_counts([normalize, sort, ...])

Return a Series containing counts of unique values.
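
The relationship between `unique`, `nunique` and `value_counts` can be sketched with the standard library. This models the semantics on a plain list only; the variable names are illustrative and the result ordering of the real methods is not asserted here.

```python
from collections import Counter

values = ["a", "b", "a", "c", "a", "b"]

# unique: distinct values, kept here in order of first appearance.
unique = list(dict.fromkeys(values))

# nunique: how many distinct values there are.
nunique = len(unique)

# value_counts: occurrences per distinct value, most frequent first.
counts = Counter(values).most_common()

# normalize=True corresponds to dividing each count by the total length.
normalized = {value: n / len(values) for value, n in counts}

print(unique)           # ['a', 'b', 'c']
print(nunique)          # 3
print(counts)           # [('a', 3), ('b', 2), ('c', 1)]
print(normalized["a"])  # 0.5
```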

Compatibility with MultiIndex#

Index.set_names(names[, level, inplace])

Set Index or MultiIndex name.

Index.droplevel(level)

Return index with requested level(s) removed.

Missing Values#

Index.fillna(value)

Fill NA/NaN values with the specified value.

Index.dropna([how])

Return Index or MultiIndex without NA/NaN values.

Index.isna()

Detect missing values.

Index.isnull()

Detect missing values.

Index.notna()

Detect existing (non-missing) values.

Index.notnull()

Detect existing (non-missing) values.
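
The missing-value methods relate to each other as follows. This is a plain-Python sketch in which `None` and float NaN stand in for NA/NaN; the `is_missing` helper is hypothetical and not part of the library.

```python
import math


def is_missing(v) -> bool:
    """Treat None and float NaN as missing."""
    return v is None or (isinstance(v, float) and math.isnan(v))


values = [1.0, None, float("nan"), 4.0]

isna = [is_missing(v) for v in values]                  # isna / isnull
notna = [not m for m in isna]                           # notna / notnull
dropped = [v for v in values if not is_missing(v)]      # dropna
filled = [0.0 if is_missing(v) else v for v in values]  # fillna(0.0)

print(isna)     # [False, True, True, False]
print(notna)    # [True, False, False, True]
print(dropped)  # [1.0, 4.0]
print(filled)   # [1.0, 0.0, 0.0, 4.0]
```

`isna` and `notna` are element-wise complements, which is why `isnull` and `notnull` behave as their respective aliases.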

Conversion#

Index.astype(dtype)

Cast a pandas-on-Spark object to the specified dtype.

Index.item()

Return the first element of the underlying data as a Python scalar.

Index.to_list()

Return a list of the values.

Index.to_series([name])

Create a Series with both index and values equal to the index keys. Useful with map for returning an indexer based on an index.

Index.to_frame([index, name])

Create a DataFrame with a column containing the Index.

Index.view()

This is defined as a copy with the same identity.

Index.to_numpy([dtype, copy])

A NumPy ndarray representing the values in this Index or MultiIndex.

CategoricalIndex#

CategoricalIndex([data, categories, ...])

Index based on an underlying Categorical.

Categorical components#

CategoricalIndex.codes

The category codes of this categorical.

CategoricalIndex.categories

The categories of this categorical.

CategoricalIndex.ordered

Whether the categories have an ordered relationship.

CategoricalIndex.rename_categories(...)

Rename categories.

CategoricalIndex.reorder_categories(...[, ...])

Reorder categories as specified in new_categories.

CategoricalIndex.add_categories(new_categories)

Add new categories.

CategoricalIndex.remove_categories(removals)

Remove the specified categories.

CategoricalIndex.remove_unused_categories()

Remove categories which are not used.

CategoricalIndex.set_categories(new_categories)

Set the categories to the specified new_categories.

CategoricalIndex.as_ordered()

Set the Categorical to be ordered.

CategoricalIndex.as_unordered()

Set the Categorical to be unordered.

CategoricalIndex.map(mapper)

Map values using input correspondence (a dict, Series, or function).

CategoricalIndex.equals(other)

Determine if two Index objects contain the same elements.

CategoricalIndex.max()

Return the maximum value of the Index.

CategoricalIndex.min()

Return the minimum value of the Index.

CategoricalIndex.tolist()

Return a list of the values.

MultiIndex#

MultiIndex([levels, codes, sortorder, ...])

pandas-on-Spark MultiIndex that corresponds to pandas MultiIndex logically.

MultiIndex Constructors#

MultiIndex.from_arrays(arrays[, sortorder, ...])

Convert arrays to MultiIndex.

MultiIndex.from_tuples(tuples[, sortorder, ...])

Convert list of tuples to MultiIndex.

MultiIndex.from_product(iterables[, ...])

Make a MultiIndex from the cartesian product of multiple iterables.

MultiIndex.from_frame(df[, names])

Make a MultiIndex from a DataFrame.
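
Each constructor ultimately yields the same thing: a sequence of tuples with one entry per level. The sketch below models that with the standard library only; it is an illustration of the input shapes, not of how pandas-on-Spark builds a MultiIndex.

```python
from itertools import product

# from_arrays: one array per level, combined position by position.
arrays = [[1, 1, 2], ["a", "b", "a"]]
from_arrays = list(zip(*arrays))

# from_tuples: the tuples are already the index entries.
from_tuples = [(1, "a"), (1, "b"), (2, "a")]

# from_product: every combination of the input iterables.
numbers = [0, 1]
colors = ["green", "purple"]
from_product = list(product(numbers, colors))

print(from_arrays == from_tuples)  # True
print(from_product)
# [(0, 'green'), (0, 'purple'), (1, 'green'), (1, 'purple')]
```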

MultiIndex Properties#

MultiIndex.has_duplicates

If index has duplicates, return True, otherwise False.

MultiIndex.hasnans

Return True if it has any missing values.

MultiIndex.inferred_type

Return a string of the type inferred from the values.

MultiIndex.shape

Return a tuple of the shape of the underlying data.

MultiIndex.names

Return names of the Index.

MultiIndex.ndim

Return an int representing the number of array dimensions.

MultiIndex.empty

Return True if the current object is empty.

MultiIndex.T

Return the transpose. For an Index, this is the Index itself.

MultiIndex.size

Return an int representing the number of elements in this object.

MultiIndex.nlevels

Number of levels in Index & MultiIndex.

MultiIndex.levshape

A tuple with the length of each level.

MultiIndex.values

Return an array representing the data in the Index.

MultiIndex.dtypes

Return the dtypes as a Series for the underlying MultiIndex.

MultiIndex components#

MultiIndex.swaplevel([i, j])

Swap level i with level j.

MultiIndex.droplevel(level)

Return index with requested level(s) removed.

MultiIndex Missing Values#

MultiIndex.fillna(value)

Fill NA/NaN values with the specified value.

MultiIndex.dropna([how])

Return Index or MultiIndex without NA/NaN values.

MultiIndex Modifying and computations#

MultiIndex.equals(other)

Determine if two Index objects contain the same elements.

MultiIndex.equal_levels(other)

Return True if the levels of both MultiIndex objects are the same.

MultiIndex.identical(other)

Similar to equals, but check that other comparable attributes are also equal.

MultiIndex.insert(loc, item)

Make new MultiIndex inserting new item at location.

MultiIndex.drop(codes[, level])

Make new MultiIndex with passed list of labels deleted.

MultiIndex.copy([deep])

Make a copy of this object.

MultiIndex.delete(loc)

Make new Index with passed location(-s) deleted.

MultiIndex.rename(name[, inplace])

Alter Index or MultiIndex name.

MultiIndex.repeat(repeats)

Repeat elements of an Index/MultiIndex.

MultiIndex.take(indices)

Return the elements in the given positional indices along an axis.

MultiIndex.unique([level])

Return unique values in the index.

MultiIndex.min()

Return the minimum value of the Index.

MultiIndex.max()

Return the maximum value of the Index.

MultiIndex.value_counts([normalize, sort, ...])

Return a Series containing counts of unique values.

MultiIndex Combining / joining / set operations#

MultiIndex.append(other)

Append a collection of Index objects together.

MultiIndex.intersection(other)

Form the intersection of two Index objects.

MultiIndex.union(other[, sort])

Form the union of two Index objects.

MultiIndex.difference(other[, sort])

Return a new Index with elements from the index that are not in other.

MultiIndex.symmetric_difference(other[, ...])

Compute the symmetric difference of two MultiIndex objects.
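
The four set operations can be modelled on plain Python sets of tuples. This is only a sketch of the set semantics; results are sorted here for readable output, and no claim is made about the ordering the library itself returns.

```python
left = [("a", 1), ("b", 2), ("c", 3)]
right = [("b", 2), ("c", 3), ("d", 4)]

l, r = set(left), set(right)

intersection = sorted(l & r)          # entries present in both
union = sorted(l | r)                 # entries present in either
difference = sorted(l - r)            # entries in left but not in right
symmetric_difference = sorted(l ^ r)  # entries in exactly one of the two

print(intersection)          # [('b', 2), ('c', 3)]
print(union)                 # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
print(difference)            # [('a', 1)]
print(symmetric_difference)  # [('a', 1), ('d', 4)]
```

`append`, by contrast, is plain concatenation (`left + right` in this model) and keeps duplicates.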

MultiIndex Conversion#

MultiIndex.astype(dtype)

Cast a pandas-on-Spark object to the specified dtype.

MultiIndex.item()

Return the first element of the underlying data as a Python tuple.

MultiIndex.to_list()

Return a list of the values.

MultiIndex.to_series([name])

Create a Series with both index and values equal to the index keys. Useful with map for returning an indexer based on an index.

MultiIndex.to_frame([index, name])

Create a DataFrame with the levels of the MultiIndex as columns.

MultiIndex.view()

This is defined as a copy with the same identity.

MultiIndex.to_numpy([dtype, copy])

A NumPy ndarray representing the values in this Index or MultiIndex.

DatetimeIndex#

DatetimeIndex([data, freq, normalize, ...])

Immutable ndarray-like of datetime64 data.

Time/date components#

DatetimeIndex.year

The year of the datetime.

DatetimeIndex.month

The month of the timestamp, where January = 1 and December = 12.

DatetimeIndex.day

The days of the datetime.

DatetimeIndex.hour

The hours of the datetime.

DatetimeIndex.minute

The minutes of the datetime.

DatetimeIndex.second

The seconds of the datetime.

DatetimeIndex.microsecond

The microseconds of the datetime.

DatetimeIndex.isocalendar()

Calculate year, week, and day according to the ISO 8601 standard.

DatetimeIndex.dayofweek

The day of the week with Monday=0, Sunday=6.

DatetimeIndex.day_of_week

The day of the week with Monday=0, Sunday=6.

DatetimeIndex.weekday

The day of the week with Monday=0, Sunday=6.

DatetimeIndex.dayofyear

The ordinal day of the year.

DatetimeIndex.day_of_year

The ordinal day of the year.
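
The numbering conventions above match those of Python's `datetime` module, so they can be checked on a single date with the standard library. This sketch illustrates the conventions only; it does not call pandas-on-Spark.

```python
from datetime import date

d = date(2021, 1, 3)  # a Sunday

# dayofweek / day_of_week / weekday: Monday=0 ... Sunday=6.
print(d.weekday())  # 6

# isocalendar: ISO 8601 year, week and day (Monday=1 ... Sunday=7).
# Early January can still belong to the last ISO week of the previous year.
iso_year, iso_week, iso_day = d.isocalendar()
print(iso_year, iso_week, iso_day)  # 2020 53 7

# dayofyear / day_of_year: 1-based ordinal day within the year.
print(date(2021, 3, 1).timetuple().tm_yday)  # 60
```

Note that the ISO day number differs from `dayofweek` by one: ISO counts Monday as 1, whereas `dayofweek` counts Monday as 0.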

DatetimeIndex.quarter

The quarter of the date.

DatetimeIndex.is_month_start

Indicates whether the date is the first day of the month.

DatetimeIndex.is_month_end

Indicates whether the date is the last day of the month.

DatetimeIndex.is_quarter_start

Indicator for whether the date is the first day of a quarter.

DatetimeIndex.is_quarter_end

Indicator for whether the date is the last day of a quarter.

DatetimeIndex.is_year_start

Indicate whether the date is the first day of a year.

DatetimeIndex.is_year_end

Indicate whether the date is the last day of the year.

DatetimeIndex.is_leap_year

Boolean indicator if the date belongs to a leap year.

DatetimeIndex.daysinmonth

The number of days in the month.

DatetimeIndex.days_in_month

The number of days in the month.
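
The calendar indicators can be derived from a date with the standard `calendar` module. The sketch below is a hypothetical helper that assumes calendar quarters starting in January, April, July and October; it is not the library's implementation.

```python
import calendar
from datetime import date


def month_flags(d: date) -> dict:
    """Compute the calendar indicators for a single date."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    is_month_end = d.day == last_day
    return {
        "quarter": (d.month - 1) // 3 + 1,
        "days_in_month": last_day,
        "is_leap_year": calendar.isleap(d.year),
        "is_month_start": d.day == 1,
        "is_month_end": is_month_end,
        "is_quarter_start": d.day == 1 and d.month % 3 == 1,
        "is_quarter_end": is_month_end and d.month % 3 == 0,
        "is_year_start": d.month == 1 and d.day == 1,
        "is_year_end": d.month == 12 and d.day == 31,
    }


flags = month_flags(date(2024, 2, 29))
print(flags["days_in_month"], flags["is_month_end"], flags["is_quarter_end"])
# 29 True False
```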

Selecting#

DatetimeIndex.indexer_between_time(...[, ...])

Return index locations of values between particular times of day (example: 9:00-9:30AM).

DatetimeIndex.indexer_at_time(time[, asof])

Return index locations of values at particular time of day (example: 9:30AM).
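
Both selectors return integer positions rather than the timestamps themselves. The following plain-Python sketch shows the idea; the function bodies and the `include_start` / `include_end` flags are assumptions of this sketch, with both endpoints included by default.

```python
from datetime import datetime, time


def indexer_between_time(stamps, start, end, include_start=True, include_end=True):
    """Positions whose time of day falls between start and end."""
    positions = []
    for pos, ts in enumerate(stamps):
        t = ts.time()
        after_start = t >= start if include_start else t > start
        before_end = t <= end if include_end else t < end
        if after_start and before_end:
            positions.append(pos)
    return positions


def indexer_at_time(stamps, at):
    """Positions whose time of day equals `at` exactly."""
    return [pos for pos, ts in enumerate(stamps) if ts.time() == at]


stamps = [
    datetime(2021, 1, 1, 8, 59),
    datetime(2021, 1, 1, 9, 0),
    datetime(2021, 1, 1, 9, 15),
    datetime(2021, 1, 1, 9, 30),
    datetime(2021, 1, 1, 10, 0),
]

print(indexer_between_time(stamps, time(9, 0), time(9, 30)))  # [1, 2, 3]
print(indexer_at_time(stamps, time(9, 30)))                   # [3]
```

Only the time of day is compared, so timestamps on different dates are selected as long as their clock time qualifies.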

Time-specific operations#

DatetimeIndex.normalize()

Convert times to midnight.

DatetimeIndex.strftime(date_format)

Convert to a string Index using specified date_format.

DatetimeIndex.round(freq, *args, **kwargs)

Perform round operation on the data to the specified freq.

DatetimeIndex.floor(freq, *args, **kwargs)

Perform floor operation on the data to the specified freq.

DatetimeIndex.ceil(freq, *args, **kwargs)

Perform ceil operation on the data to the specified freq.
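
Flooring and ceiling to a frequency amount to snapping a timestamp onto a regular grid. The sketch below models this with `datetime` arithmetic for fixed-length frequencies; `floor_to`, `ceil_to` and the choice of the Unix epoch as grid origin are assumptions of this sketch. Rounding is left out because its tie-breaking rule is not stated in this reference.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # grid origin assumed for this sketch


def floor_to(ts: datetime, freq: timedelta) -> datetime:
    """Largest multiple of freq (counted from EPOCH) that is <= ts."""
    return ts - ((ts - EPOCH) % freq)


def ceil_to(ts: datetime, freq: timedelta) -> datetime:
    """Smallest multiple of freq (counted from EPOCH) that is >= ts."""
    floored = floor_to(ts, freq)
    return floored if floored == ts else floored + freq


ts = datetime(2021, 1, 1, 10, 17, 30)

print(floor_to(ts, timedelta(hours=1)))     # 2021-01-01 10:00:00
print(ceil_to(ts, timedelta(hours=1)))      # 2021-01-01 11:00:00
print(floor_to(ts, timedelta(minutes=15)))  # 2021-01-01 10:15:00
print(floor_to(ts, timedelta(days=1)))      # 2021-01-01 00:00:00
```

In this model, flooring to one day gives the same result as `normalize()`, which converts times to midnight.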

DatetimeIndex.month_name([locale])

Return the month names of the DatetimeIndex with specified locale.

DatetimeIndex.day_name([locale])

Return the day names of the DatetimeIndex with specified locale.

TimedeltaIndex#

TimedeltaIndex([data, unit, freq, closed, ...])

Immutable ndarray-like of timedelta64 data, represented internally as int64, and which can be boxed to timedelta objects.

Components#

TimedeltaIndex.days

Number of days for each element.

TimedeltaIndex.seconds

Number of seconds (>= 0 and less than 1 day) for each element.

TimedeltaIndex.microseconds

Number of microseconds (>= 0 and less than 1 second) for each element.
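
Python's own `datetime.timedelta` splits a duration into the same three components with the same ranges, so it serves as a small model of what each attribute holds per element. This sketch uses only the standard library and assumes the index follows the same normalization for negative durations.

```python
from datetime import timedelta

td = timedelta(days=1, hours=2, minutes=3, seconds=4, microseconds=5)

print(td.days)          # 1
print(td.seconds)       # 7384  (2*3600 + 3*60 + 4)
print(td.microseconds)  # 5

# Negative durations keep seconds and microseconds non-negative
# by borrowing from the days component.
neg = timedelta(seconds=-1)
print(neg.days, neg.seconds, neg.microseconds)  # -1 86399 0
```

The components are not totals: `seconds` excludes whole days, and `microseconds` excludes whole seconds.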