# pyspark.sql.DataFrameStatFunctions

class pyspark.sql.DataFrameStatFunctions(df: pyspark.sql.dataframe.DataFrame)

Statistical functions for a DataFrame. An instance is normally obtained through the `DataFrame.stat` property rather than constructed directly.

New in version 1.4.

Methods

| Method | Description |
| --- | --- |
| `approxQuantile(col, probabilities, relativeError)` | Calculates the approximate quantiles of numerical columns of a DataFrame. |
| `corr(col1, col2[, method])` | Calculates the correlation of two columns of a DataFrame as a double value. |
| `cov(col1, col2)` | Calculates the sample covariance of the given columns, specified by their names, as a double value. |
| `crosstab(col1, col2)` | Computes a pair-wise frequency table of the given columns. |
| `freqItems(cols[, support])` | Finds frequent items for columns, possibly with false positives. |
| `sampleBy(col, fractions[, seed])` | Returns a stratified sample without replacement based on the fraction given for each stratum. |