Supported pandas API
The following tables show which pandas APIs are implemented in pandas API on Spark and which are not. Some pandas APIs do not support every parameter, so the third column lists the missing parameters for each API.
‘Y’ in the second column means the API is implemented with all of its parameters.
‘N’ means the API is not implemented yet.
‘P’ means the API is partially implemented: some of its parameters are missing.
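One way to read the three status codes is as the outcome of comparing a reference signature with its counterpart. The sketch below uses only the standard library. The names `support_status`, `reference_sort`, `full_sort`, and `partial_sort` are hypothetical stand-ins and not part of pandas or pandas API on Spark, and this is not necessarily how the tables on this page are produced.

```python
import inspect


def support_status(reference, candidate):
    """Classify a candidate against a reference callable.

    Returns a (status, missing_parameters) tuple where status is
    'Y' (all parameters present), 'P' (some parameters missing),
    or 'N' (no candidate implementation at all).
    """
    if candidate is None:
        return "N", []
    ref_params = inspect.signature(reference).parameters
    cand_params = inspect.signature(candidate).parameters
    # Keep the reference's declaration order for the missing names.
    missing = [name for name in ref_params if name not in cand_params]
    return ("P" if missing else "Y"), missing


# Hypothetical stand-in for a reference (pandas-like) method.
def reference_sort(values, ascending=True, na_position="last", key=None):
    ...


# Hypothetical reimplementation that accepts every parameter.
def full_sort(values, ascending=True, na_position="last", key=None):
    ...


# Hypothetical reimplementation that accepts only some parameters.
def partial_sort(values, ascending=True):
    ...


status_full = support_status(reference_sort, full_sort)
status_partial = support_status(reference_sort, partial_sort)
status_absent = support_status(reference_sort, None)

for label, result in [("full", status_full),
                      ("partial", status_partial),
                      ("absent", status_absent)]:
    print(label, result)
```

In this sketch a ‘P’ result carries the list that would populate the "Missing parameters" column, while ‘Y’ and ‘N’ leave it empty.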
Every API in the lists below computes on the data with distributed execution, except those that require local execution by design. For example, DataFrame.to_numpy() has to collect the data to the driver side.
If you need a pandas API or parameter that is not implemented, you can create an Apache Spark JIRA to request it or contribute it yourself.
The API list is updated based on the latest pandas official API reference.
CategoricalIndex API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| argsort | N | |
| | Y | |
| | Y | |
| | Y | |
| asof_locs | N | |
| | P | |
| | Y | |
| | Y | |
| diff | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| duplicated | N | |
| | Y | |
| | Y | |
| | P | |
| format | N | |
| get_indexer | N | |
| get_indexer_for | N | |
| get_indexer_non_unique | N | |
| | Y | |
| get_loc | N | |
| get_slice_bound | N | |
| groupby | N | |
| | Y | |
| | Y | |
| infer_objects | N | |
| | Y | |
| | P | |
| is_ | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| join | N | |
| | P | |
| | Y | |
| memory_usage | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| putmask | N | |
| ravel | N | |
| reindex | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| round | N | |
| searchsorted | N | |
| | Y | |
| | Y | |
| | P | |
| slice_indexer | N | |
| slice_locs | N | |
| | Y | |
| | P | |
| sortlevel | N | |
| | Y | |
| | P | |
| to_flat_index | N | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| where | N | |
DataFrame API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| asfreq | N | |
| asof | N | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| combine | N | |
| | Y | |
| compare | N | |
| convert_dtypes | N | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| infer_objects | N | |
| | P | |
| | Y | |
| | P | |
| isetitem | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| memory_usage | N | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| reorder_levels | N | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| set_axis | N | |
| set_flags | N | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| to_gbq | N | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| to_period | N | |
| to_pickle | N | |
| | Y | |
| to_sql | N | |
| | Y | |
| | P | |
| to_timestamp | N | |
| to_xarray | N | |
| to_xml | N | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| tz_convert | N | |
| tz_localize | N | |
| | P | |
| | P | |
| value_counts | N | |
| | P | |
| | P | |
| | P | |
DatetimeIndex API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| argsort | N | |
| as_unit | N | |
| | Y | |
| asof_locs | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| diff | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| duplicated | N | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| format | N | |
| get_indexer | N | |
| get_indexer_for | N | |
| get_indexer_non_unique | N | |
| | Y | |
| get_loc | N | |
| get_slice_bound | N | |
| groupby | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| infer_objects | N | |
| | Y | |
| | P | |
| is_ | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| join | N | |
| | Y | |
| | P | |
| mean | N | |
| memory_usage | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| putmask | N | |
| ravel | N | |
| reindex | N | |
| | Y | |
| | P | |
| | Y | |
| searchsorted | N | |
| | Y | |
| | P | |
| slice_indexer | N | |
| slice_locs | N | |
| snap | N | |
| | Y | |
| | P | |
| sortlevel | N | |
| std | N | |
| | Y | |
| | Y | |
| | P | |
| to_flat_index | N | |
| | Y | |
| to_julian_date | N | |
| | Y | |
| | P | |
| to_period | N | |
| to_pydatetime | N | |
| | P | |
| | Y | |
| | Y | |
| tz_convert | N | |
| tz_localize | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| where | N | |
Index API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| argsort | N | |
| | Y | |
| asof_locs | N | |
| | P | |
| | Y | |
| | Y | |
| diff | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| duplicated | N | |
| | Y | |
| | Y | |
| | P | |
| format | N | |
| get_indexer | N | |
| get_indexer_for | N | |
| get_indexer_non_unique | N | |
| | Y | |
| get_loc | N | |
| get_slice_bound | N | |
| groupby | N | |
| | Y | |
| | Y | |
| infer_objects | N | |
| | Y | |
| | P | |
| is_ | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| join | N | |
| | Y | |
| | P | |
| memory_usage | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| putmask | N | |
| ravel | N | |
| reindex | N | |
| | Y | |
| | P | |
| round | N | |
| searchsorted | N | |
| | Y | |
| | P | |
| slice_indexer | N | |
| slice_locs | N | |
| | Y | |
| | P | |
| sortlevel | N | |
| | Y | |
| | P | |
| to_flat_index | N | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| where | N | |
MultiIndex API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| argsort | N | |
| | Y | |
| asof_locs | N | |
| | P | |
| | P | |
| | Y | |
| diff | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| duplicated | N | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| format | N | |
| get_indexer | N | |
| get_indexer_for | N | |
| get_indexer_non_unique | N | |
| | Y | |
| get_loc | N | |
| get_loc_level | N | |
| get_locs | N | |
| get_slice_bound | N | |
| groupby | N | |
| | Y | |
| | Y | |
| infer_objects | N | |
| | Y | |
| | P | |
| is_ | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| join | N | |
| | Y | |
| | P | |
| memory_usage | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| putmask | N | |
| ravel | N | |
| reindex | N | |
| remove_unused_levels | N | |
| | P | |
| reorder_levels | N | |
| | P | |
| round | N | |
| searchsorted | N | |
| set_codes | N | |
| set_levels | N | |
| | Y | |
| | P | |
| slice_indexer | N | |
| slice_locs | N | |
| | Y | |
| | P | |
| sortlevel | N | |
| | Y | |
| | Y | |
| | P | |
| to_flat_index | N | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| truncate | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| where | N | |
Series API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| asfreq | N | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| case_when | N | |
| | P | |
| combine | N | |
| | Y | |
| | P | |
| convert_dtypes | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| infer_objects | N | |
| info | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| memory_usage | N | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| ravel | N | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| reorder_levels | N | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| set_axis | N | |
| set_flags | N | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| to_period | N | |
| to_pickle | N | |
| to_sql | N | |
| | P | |
| to_timestamp | N | |
| to_xarray | N | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| tz_convert | N | |
| tz_localize | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | P | |
| view | N | |
| | P | |
| | P | |
TimedeltaIndex API
| API | Implemented | Missing parameters |
|---|---|---|
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| argsort | N | |
| as_unit | N | |
| | Y | |
| asof_locs | N | |
| | P | |
| ceil | N | |
| | Y | |
| | Y | |
| diff | N | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| duplicated | N | |
| | Y | |
| | Y | |
| | P | |
| floor | N | |
| format | N | |
| get_indexer | N | |
| get_indexer_for | N | |
| get_indexer_non_unique | N | |
| | Y | |
| get_loc | N | |
| get_slice_bound | N | |
| groupby | N | |
| | Y | |
| | Y | |
| infer_objects | N | |
| | Y | |
| | P | |
| is_ | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| join | N | |
| | Y | |
| | P | |
| mean | N | |
| median | N | |
| memory_usage | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| putmask | N | |
| ravel | N | |
| reindex | N | |
| | Y | |
| | P | |
| round | N | |
| searchsorted | N | |
| | Y | |
| | P | |
| slice_indexer | N | |
| slice_locs | N | |
| | Y | |
| | P | |
| sortlevel | N | |
| std | N | |
| sum | N | |
| | Y | |
| | P | |
| to_flat_index | N | |
| | Y | |
| | Y | |
| | P | |
| to_pytimedelta | N | |
| | P | |
| | Y | |
| total_seconds | N | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| | Y | |
| where | N | |
General Function API
| API | Implemented | Missing parameters |
|---|---|---|
| array | N | |
| bdate_range | N | |
| | P | |
| crosstab | N | |
| cut | N | |
| | P | |
| eval | N | |
| factorize | N | |
| from_dummies | N | |
| | Y | |
| infer_freq | N | |
| interval_range | N | |
| | Y | |
| | Y | |
| | P | |
| lreshape | N | |
| | P | |
| | P | |
| | Y | |
| merge_ordered | N | |
| | Y | |
| | Y | |
| period_range | N | |
| pivot | N | |
| pivot_table | N | |
| qcut | N | |
| | P | |
| | P | |
| | P | |
| read_feather | N | |
| read_fwf | N | |
| read_gbq | N | |
| read_hdf | N | |
| | P | |
| | P | |
| | P | |
| | P | |
| read_pickle | N | |
| read_sas | N | |
| read_spss | N | |
| | P | |
| | P | |
| | P | |
| read_stata | N | |
| | P | |
| read_xml | N | |
| set_eng_float_format | N | |
| show_versions | N | |
| test | N | |
| | P | |
| | P | |
| | P | |
| to_pickle | N | |
| | Y | |
| unique | N | |
| value_counts | N | |
| wide_to_long | N | |
Expanding API
| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| | P | |
| cov | N | |
| | P | |
| | P | |
| | P | |
| median | N | |
| | P | |
| | P | |
| rank | N | |
| sem | N | |
| | P | |
| | P | |
| | P | |
| | P | |
ExpandingGroupby API
| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| | P | |
| cov | N | |
| | P | |
| | P | |
| | P | |
| median | N | |
| | P | |
| | P | |
| rank | N | |
| sem | N | |
| | P | |
| | P | |
| | P | |
| | P | |
Rolling API
| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| | P | |
| cov | N | |
| | P | |
| | P | |
| | P | |
| median | N | |
| | P | |
| | P | |
| rank | N | |
| sem | N | |
| | P | |
| | P | |
| | P | |
| | P | |
RollingGroupby API
| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| apply | N | |
| corr | N | |
| | P | |
| cov | N | |
| | P | |
| | P | |
| | P | |
| median | N | |
| | P | |
| | P | |
| rank | N | |
| sem | N | |
| | P | |
| | P | |
| | P | |
| | P | |
Window API
| API | Implemented | Missing parameters |
|---|---|---|
| agg | N | |
| aggregate | N | |
| mean | N | |
| std | N | |
| sum | N | |
| var | N | |
DataFrameGroupBy API
| API | Implemented | Missing parameters |
|---|---|---|
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| boxplot | N | |
| | Y | |
| corrwith | N | |
| | Y | |
| cov | N | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| hist | N | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| ngroup | N | |
| | Y | |
| ohlc | N | |
| pct_change | N | |
| pipe | N | |
| | Y | |
| | P | |
| | P | |
| resample | N | |
| | Y | |
| sample | N | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| take | N | |
| | P | |
| value_counts | N | |
| | P | |
GroupBy API
| API | Implemented | Missing parameters |
|---|---|---|
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| describe | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| ngroup | N | |
| ohlc | N | |
| pct_change | N | |
| pipe | N | |
| | Y | |
| | P | |
| | P | |
| resample | N | |
| | Y | |
| sample | N | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | Y | |
| | P | |
SeriesGroupBy API
| API | Implemented | Missing parameters |
|---|---|---|
| | P | |
| | P | |
| | Y | |
| | P | |
| | Y | |
| | Y | |
| corr | N | |
| | Y | |
| cov | N | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| describe | N | |
| | P | |
| | Y | |
| | Y | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| hist | N | |
| | P | |
| | P | |
| | P | |
| | P | |
| | P | |
| | Y | |
| | P | |
| ngroup | N | |
| | P | |
| | P | |
| | Y | |
| ohlc | N | |
| pct_change | N | |
| pipe | N | |
| | Y | |
| | P | |
| | P | |
| resample | N | |
| | Y | |
| sample | N | |
| | P | |
| | P | |
| | Y | |
| | P | |
| | P | |
| | P | |
| | Y | |
| take | N | |
| | P | |
| | Y | |
| | P | |
| | P | |