pyspark.pandas.DataFrame.plot.bar

plot.bar(x=None, y=None, **kwds)

Vertical bar plot.

Parameters
x : label or position, optional

Allows plotting of one column versus another. If not specified, the index of the DataFrame is used.

y : label or position, optional

Allows plotting of one column versus another. If not specified, all numerical columns are used.

**kwds : optional

Additional keyword arguments are documented in pyspark.pandas.Series.plot() or pyspark.pandas.DataFrame.plot().

Returns
plotly.graph_objs.Figure

Return a custom object when backend != plotly. Return an ndarray when subplots=True (matplotlib only).

Examples

Basic plot.

For Series:

>>> import pyspark.pandas as ps
>>> s = ps.Series([1, 3, 2])
>>> s.plot.bar()

For DataFrame:

>>> df = ps.DataFrame({'lab': ['A', 'B', 'C'], 'val': [10, 30, 20]})
>>> df.plot.bar(x='lab', y='val')  

Plot a whole DataFrame as a bar plot. Each column is drawn in a distinct color, and the columns are stacked at each index position along the horizontal axis.

>>> speed = [0.1, 17.5, 40, 48, 52, 69, 88]
>>> lifespan = [2, 8, 70, 1.5, 25, 12, 28]
>>> index = ['snail', 'pig', 'elephant',
...          'rabbit', 'giraffe', 'coyote', 'horse']
>>> df = ps.DataFrame({'speed': speed,
...                    'lifespan': lifespan}, index=index)
>>> df.plot.bar()  

Instead of stacking, the figure can be split by column with plotly APIs.

>>> from plotly.subplots import make_subplots
>>> speed = [0.1, 17.5, 40, 48, 52, 69, 88]
>>> lifespan = [2, 8, 70, 1.5, 25, 12, 28]
>>> index = ['snail', 'pig', 'elephant',
...          'rabbit', 'giraffe', 'coyote', 'horse']
>>> df = ps.DataFrame({'speed': speed,
...                    'lifespan': lifespan}, index=index)
>>> fig = (make_subplots(rows=2, cols=1)
...        .add_trace(df.plot.bar(y='speed').data[0], row=1, col=1)
...        .add_trace(df.plot.bar(y='lifespan').data[0], row=2, col=1))
>>> fig  

Plot a single column.

>>> speed = [0.1, 17.5, 40, 48, 52, 69, 88]
>>> lifespan = [2, 8, 70, 1.5, 25, 12, 28]
>>> index = ['snail', 'pig', 'elephant',
...          'rabbit', 'giraffe', 'coyote', 'horse']
>>> df = ps.DataFrame({'speed': speed,
...                    'lifespan': lifespan}, index=index)
>>> df.plot.bar(y='speed')  

Plot only selected categories for the DataFrame.

>>> speed = [0.1, 17.5, 40, 48, 52, 69, 88]
>>> lifespan = [2, 8, 70, 1.5, 25, 12, 28]
>>> index = ['snail', 'pig', 'elephant',
...          'rabbit', 'giraffe', 'coyote', 'horse']
>>> df = ps.DataFrame({'speed': speed,
...                    'lifespan': lifespan}, index=index)
>>> df.plot.bar(x='lifespan')
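
The defaults described under Parameters (index for a missing x, all numerical columns for a missing y) can be summarized in plain Python. This is only an illustrative sketch and not the library's implementation: resolve_bar_axes is a hypothetical helper, it handles labels but not integer positions, and it assumes that the column chosen as x is left out of the y series.

```python
from numbers import Number


def _is_numeric(values):
    # Treat a column as numerical when every value is a number (bools excluded).
    return all(isinstance(v, Number) and not isinstance(v, bool) for v in values)


def resolve_bar_axes(columns, index, x=None, y=None):
    """Hypothetical helper: choose x values and y series for a bar plot.

    columns maps a column name to its list of values; index holds the row labels.
    """
    # x omitted: fall back to the index. x given: use that column's values.
    x_values = list(index) if x is None else list(columns[x])

    if y is not None:
        # y given: plot exactly that column.
        y_names = [y]
    else:
        # y omitted: keep every numerical column.
        # Assumption: the column used as x is not plotted again as y.
        y_names = [
            name for name, values in columns.items()
            if name != x and _is_numeric(values)
        ]
    return x_values, {name: list(columns[name]) for name in y_names}


data = {"lab": ["A", "B", "C"], "val": [10, 30, 20]}

# Neither x nor y given: the index is the x axis, and only 'val' is numerical.
x_values, y_series = resolve_bar_axes(data, index=[0, 1, 2])
print(x_values)  # [0, 1, 2]
print(y_series)  # {'val': [10, 30, 20]}
```

Under the same assumptions, passing x="lab" would return ['A', 'B', 'C'] as the x values and still select 'val' as the only y series.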