pyspark.sql.plot.core.PySparkPlotAccessor.hist
- PySparkPlotAccessor.hist(column=None, bins=10, **kwargs)
- Draw one histogram of the DataFrame’s columns.
- A histogram is a representation of the distribution of data.
- Parameters
- column: str or list of str, optional
- Column name or list of names to be used for creating the histogram plot. If None (default), all numeric columns will be used. If no numeric columns exist, behavior may depend on the plot backend. 
- bins: integer, default 10
- Number of histogram bins to be used. 
- **kwargs
- Additional keyword arguments. 
 
- Returns
- plotly.graph_objs.Figure
 
- Examples
>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.getOrCreate()
>>> data = [(5.1, 3.5, 0), (4.9, 3.0, 0), (7.0, 3.2, 1), (6.4, 3.2, 1), (5.9, 3.0, 2)]
>>> columns = ["length", "width", "species"]
>>> df = spark.createDataFrame(data, columns)
>>> df.plot.hist(bins=4)
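To make the effect of `bins` concrete, the following pure-Python sketch counts the `length` values from the example above into 4 intervals. It assumes equal-width bins spanning the column's minimum to maximum. That is a common histogram convention, not a statement of how PySpark or the plot backend computes bins internally. The helper name `equal_width_histogram` is hypothetical and is not part of the PySpark API.

```python
def equal_width_histogram(values, bins=10):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Illustrative only: assumes equal-width binning, which may differ
    from what the plotting backend actually does.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # bins + 1 edges delimit `bins` intervals
    edges = [lo + i * width for i in range(bins + 1)]
    if width == 0:
        # All values are identical: place everything in the first bin.
        return edges, [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # The maximum value falls on the last edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


# "length" column from the example data above
lengths = [5.1, 4.9, 7.0, 6.4, 5.9]
edges, counts = equal_width_histogram(lengths, bins=4)
print(counts)  # [2, 1, 1, 1]
```

Under this assumption, a larger `bins` value yields narrower intervals and a finer-grained view of the distribution, while a smaller value groups more rows into each bar.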