describe {SparkR}    R Documentation

summary

Description

Computes summary statistics (count, mean, standard deviation, min, and max) for numeric columns. If no columns are given, this function computes statistics for all numerical columns.

Returns the summary of a model produced by glm() or spark.glm(), similarly to R's summary().

Returns the summary of a naive Bayes model produced by spark.naiveBayes(), similarly to R's summary().

Returns the summary of a k-means model produced by spark.kmeans(), similarly to R's summary().

Returns the summary of an AFT survival regression model produced by spark.survreg(), similarly to R's summary().

Usage

describe(x, col, ...)

summary(object, ...)

## S4 method for signature 'SparkDataFrame,character'
describe(x, col, ...)

## S4 method for signature 'SparkDataFrame,ANY'
describe(x)

## S4 method for signature 'SparkDataFrame'
summary(object, ...)

## S4 method for signature 'GeneralizedLinearRegressionModel'
summary(object, ...)

## S4 method for signature 'NaiveBayesModel'
summary(object, ...)

## S4 method for signature 'KMeansModel'
summary(object, ...)

## S4 method for signature 'AFTSurvivalRegressionModel'
summary(object, ...)

Arguments

x

A SparkDataFrame whose numeric columns are to be summarized.

col

A string specifying the name of a column to summarize.

...

Additional column names to compute statistics for.

object

A fitted MLlib model: a generalized linear model produced by glm() or spark.glm(), a naive Bayes model produced by spark.naiveBayes(), a k-means model produced by spark.kmeans(), or an AFT survival regression model produced by spark.survreg().

Value

describe and summary applied to a SparkDataFrame return a SparkDataFrame holding the computed statistics.

For a model produced by glm() or spark.glm(), summary returns the model's coefficients and intercept.

For a naive Bayes model, summary returns a list containing 'apriori', the label distribution, and 'tables', conditional probabilities given the target label.

For a k-means model, summary returns the model's coefficients, size and cluster.

For an AFT survival regression model, summary returns the model's coefficients, intercept and log(scale).

See Also

Other SparkDataFrame functions: SparkDataFrame-class, [[, agg, arrange, as.data.frame, attach, cache, collect, colnames, coltypes, columns, count, dapply, dim, distinct, dropDuplicates, dropna, drop, dtypes, except, explain, filter, first, group_by, head, histogram, insertInto, intersect, isLocal, join, limit, merge, mutate, ncol, persist, printSchema, registerTempTable, rename, repartition, sample, saveAsTable, selectExpr, select, showDF, show, str, take, unionAll, unpersist, withColumn, write.df, write.jdbc, write.json, write.parquet, write.text

Examples

## Not run: 
##D sparkR.session()
##D path <- "path/to/file.json"
##D df <- read.json(path)
##D describe(df)
##D describe(df, "col1")
##D describe(df, "col1", "col2")
## End(Not run)
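
The result of describe() is itself a SparkDataFrame; a minimal sketch of materializing it on the driver (assuming the session and df from the example above):

## Not run: 
##D stats <- describe(df, "col1")
##D # Bring the summary rows (count, mean, stddev, min, max) into a local data.frame
##D collect(stats)
## End(Not run)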
## Not run: 
##D model <- glm(y ~ x, data = trainingData)
##D summary(model)
## End(Not run)
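
The glm()/spark.glm() summary is a list, so its components can be inspected directly; a hedged sketch using the coefficients component named in the Value section:

## Not run: 
##D s <- summary(model)
##D # Coefficient estimates, including the intercept
##D s$coefficients
## End(Not run)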
## Not run: 
##D model <- spark.naiveBayes(trainingData, y ~ x)
##D summary(model)
## End(Not run)
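
A sketch of accessing the naive Bayes summary components named in the Value section:

## Not run: 
##D s <- summary(model)
##D s$apriori  # the label distribution
##D s$tables   # conditional probabilities given the target label
## End(Not run)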
## Not run: 
##D model <- spark.kmeans(trainingData, ~ ., 2)
##D summary(model)
## End(Not run)
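
A sketch of accessing the k-means summary components named in the Value section:

## Not run: 
##D s <- summary(model)
##D s$coefficients  # cluster centers
##D s$size          # number of points assigned to each cluster
##D s$cluster       # cluster assignments for the training data
## End(Not run)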
## Not run: 
##D model <- spark.survreg(trainingData, Surv(futime, fustat) ~ ecog_ps + rx)
##D summary(model)
## End(Not run)
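
A sketch of accessing the AFT survival regression summary, whose coefficients include the intercept and log(scale) per the Value section:

## Not run: 
##D s <- summary(model)
##D s$coefficients
## End(Not run)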

[Package SparkR version 2.0.0 Index]