glm,formula,ANY,SparkDataFrame-method {SparkR}    R Documentation

Generalized Linear Models (R-compliant)

Description

Fits a generalized linear model, similarly to R's glm().

Usage

## S4 method for signature 'formula,ANY,SparkDataFrame'
glm(formula, family = gaussian,
  data, epsilon = 1e-06, maxit = 25, weightCol = NULL,
  var.power = 0, link.power = 1 - var.power)

Arguments

formula

a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See R's family documentation at https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html. The currently supported families are binomial, gaussian, poisson, Gamma, and tweedie.

data

the training data: a SparkDataFrame, or the data accepted by R's glm.

epsilon

positive convergence tolerance of iterations.

maxit

integer giving the maximal number of IRLS iterations.

weightCol

the weight column name. If this is not set or NULL, we treat all instance weights as 1.0.

var.power

the index of the power variance function in the Tweedie family.

link.power

the index of the power link function in the Tweedie family.
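The argument descriptions above can be sketched in code. The following is a minimal illustration, not a definitive recipe: it assumes a working local Spark installation, the variable names (titanic_df, m_string, and so on) are made up for this sketch, and the epsilon, maxit, var.power and link.power values are arbitrary choices. It is also assumed here that link.power = 0 selects a log link, following the usual power-link convention.

```r
library(SparkR)
sparkR.session()  # assumes Spark is installed and can be started locally

# Titanic contingency table as one row per cell, with a numeric Freq column
titanic_df <- createDataFrame(as.data.frame(Titanic))

# The same family given in the three forms described under 'family'
m_string   <- glm(Freq ~ Sex + Age, family = "poisson", data = titanic_df)
m_function <- glm(Freq ~ Sex + Age, family = poisson, data = titanic_df)
m_call     <- glm(Freq ~ Sex + Age, family = poisson(link = "log"),
                  data = titanic_df)

# Tighter convergence tolerance and a larger IRLS iteration cap
# (illustrative values only)
m_strict <- glm(Freq ~ Sex + Age, family = "poisson", data = titanic_df,
                epsilon = 1e-08, maxit = 50)

# Tweedie family: var.power and link.power are only meaningful here.
# Leaving link.power unset would use the default 1 - var.power (here -0.2).
m_tweedie <- glm(Freq ~ Sex + Age, family = "tweedie", data = titanic_df,
                 var.power = 1.2, link.power = 0)
```

For families other than tweedie, var.power and link.power can be left at their defaults.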

Value

glm returns a fitted generalized linear model.
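As a hedged sketch of how the returned model might be used, the snippet below fits a model, prints its summary, and scores the training data. It assumes a working local Spark installation; the names titanic_df, model and preds are illustrative, and the name of the output column ("prediction") is stated as an assumption about SparkR's predict method rather than something this page documents.

```r
library(SparkR)
sparkR.session()  # assumes Spark is installed and can be started locally

titanic_df <- createDataFrame(as.data.frame(Titanic))
model <- glm(Freq ~ Sex + Age, family = "gaussian", data = titanic_df)

# Coefficient table and fit statistics
fit_summary <- summary(model)
print(fit_summary)

# Score a SparkDataFrame; the fitted values are assumed to land in a
# column named "prediction"
preds <- predict(model, titanic_df)
head(select(preds, "Freq", "prediction"))
```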

Note

glm since 1.5.0

See Also

spark.glm

Examples

## Not run: 
##D sparkR.session()
##D titanic <- as.data.frame(Titanic)
##D df <- createDataFrame(titanic)
##D model <- glm(Freq ~ Sex + Age, family = "gaussian", data = df)
##D summary(model)
## End(Not run)
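The weightCol argument can be illustrated with the same data by treating each Titanic cell count as an instance weight in a binomial model. This is a sketch under stated assumptions: a local Spark installation is available, the survived_num column and the other names are hypothetical helpers introduced for this example, and empty cells are dropped as a precaution because this page does not say how zero weights are handled.

```r
library(SparkR)
sparkR.session()  # assumes Spark is installed and can be started locally

titanic_df <- createDataFrame(as.data.frame(Titanic))

# Drop cells with no passengers so that every weight is strictly positive
titanic_df <- filter(titanic_df, titanic_df$Freq > 0)

# Numeric 0/1 response built from the Survived column (hypothetical helper)
titanic_df$survived_num <- ifelse(titanic_df$Survived == "Yes", 1, 0)

# Each row counts Freq times in the fit; without weightCol every row
# would be given weight 1.0
model_w <- glm(survived_num ~ Sex + Age + Class, family = binomial,
               data = titanic_df, weightCol = "Freq")
summary(model_w)
```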

[Package SparkR version 2.2.3 Index]