describe {SparkR}    R Documentation

summary

Description

Computes summary statistics for the named numeric columns of a DataFrame. If no columns are given, statistics are computed for all numeric columns.

Returns the summary of a model produced by glm(), similarly to R's summary().

Usage

## S4 method for signature 'DataFrame,character'
describe(x, col, ...)

## S4 method for signature 'DataFrame,ANY'
describe(x)

## S4 method for signature 'DataFrame'
summary(object, ...)

describe(x, col, ...)

summary(object, ...)

## S4 method for signature 'PipelineModel'
summary(object, ...)

Arguments

x

A DataFrame for which statistics are computed.

col

A string giving the name of a column.

...

Additional expressions, such as further column names (for example, describe(df, "col1", "col2")).

object

A DataFrame, or a fitted MLlib model produced by glm().

Value

For describe() and for summary() on a DataFrame: a DataFrame containing the computed statistics.

For summary() on a fitted model: a list with 'devianceResiduals' and 'coefficients' components for the gaussian family, or a list with a 'coefficients' component for the binomial family.
For the gaussian family: 'devianceResiduals' gives the minimum and maximum deviance residuals of the estimation, and 'coefficients' gives the estimated coefficients with their estimated standard errors, t values and p-values. (This information is available only when the model is fitted by the normal solver.)
For the binomial family: 'coefficients' gives the estimated coefficients. See summary.glm for more information.
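
Because the returned list is modeled on summary.glm, a base R comparison may help when reading it. The following is a minimal sketch that runs without Spark; it uses the built-in iris data set and base R's glm(), not SparkR's, and the component name deviance.resid belongs to base R rather than to the SparkR list described above.

```r
# Base R comparison only; no Spark session is needed. iris ships with R.
fit <- glm(Sepal.Length ~ Sepal.Width, data = iris, family = gaussian())
s <- summary(fit)

# Coefficient table: estimates, standard errors, t values and p-values.
s$coefficients

# Base R keeps the full vector of deviance residuals; its range is the
# closest base R counterpart to a min/max summary of those residuals.
range(s$deviance.resid)
```

In base R the coefficient table is a matrix with one row per term, so individual values can be picked out by row and column name, for example s$coefficients["Sepal.Width", "Estimate"].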

See Also

Other DataFrame functions: $, $<-, select, select,DataFrame,Column-method, select,DataFrame,list-method, selectExpr; DataFrame-class, dataFrame, groupedData; [, [[, subset; agg, count,GroupedData-method, summarize; arrange, orderBy; as.data.frame, as.data.frame,DataFrame-method; attach, attach,DataFrame-method; cache; collect; colnames, colnames<-, columns, names, names<-; coltypes, coltypes<-; columns, dtypes, printSchema, schema; count, nrow; dim; distinct, unique; dropna, fillna, na.omit; dtypes; except; explain; filter, where; first; groupBy, group_by; head; insertInto; intersect; isLocal; join; limit; merge; mutate, transform; ncol; persist; printSchema; rbind, unionAll; registerTempTable; rename, withColumnRenamed; repartition; sample, sample_frac; saveAsParquetFile, write.parquet; saveAsTable; saveDF, write.df; selectExpr; showDF; show, show,GroupedData-method; str; take; unpersist; withColumn; write.json; write.text

Examples

## Not run: 
sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)
path <- "path/to/file.json"
df <- read.json(sqlContext, path)
describe(df)
describe(df, "col1")
describe(df, "col1", "col2")
## End(Not run)
## Not run: 
# trainingData is assumed to be an existing DataFrame with columns y and x.
# data is passed by name because the second positional argument of glm() is family.
model <- glm(y ~ x, data = trainingData)
summary(model)
## End(Not run)
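
To make the column-selection behaviour of describe() concrete, here is a local base R analogue that runs without Spark. It is a sketch under stated assumptions: local_describe is a hypothetical helper, not part of SparkR, and the choice of statistics (count, mean, stddev, min, max) and the one-row-per-statistic layout with a summary column are assumptions modeled on Spark's describe rather than details stated on this page.

```r
# Local base R analogue of describe(); no Spark session required.
# local_describe is a hypothetical helper, not part of SparkR.
local_describe <- function(df, cols = NULL) {
  # Default to every numeric column, mirroring describe(df) with no columns given.
  if (is.null(cols)) {
    cols <- names(df)[vapply(df, is.numeric, logical(1))]
  }
  stats <- lapply(df[cols], function(v) {
    c(count  = sum(!is.na(v)),
      mean   = mean(v, na.rm = TRUE),
      stddev = sd(v, na.rm = TRUE),
      min    = min(v, na.rm = TRUE),
      max    = max(v, na.rm = TRUE))
  })
  # Bind the per-column vectors into a matrix: one row per statistic,
  # one column per input column.
  out <- do.call(cbind, stats)
  data.frame(summary = rownames(out), out, row.names = NULL, check.names = FALSE)
}

local_describe(iris)                  # all numeric columns
local_describe(iris, "Sepal.Length")  # a single named column
```

The non-numeric Species column of iris is skipped in the first call because only numeric columns are selected by default.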

[Package SparkR version 1.6.3 Index]