glm {SparkR}    R Documentation

Fits a generalized linear model

Description

Fits a generalized linear model, similarly to R's glm(). Also see the glmnet package.

Usage

glm(formula, family = gaussian, data, weights, subset, na.action,
  start = NULL, etastart, mustart, offset, control = list(...),
  model = TRUE, method = "glm.fit", x = FALSE, y = TRUE,
  contrasts = NULL, ...)

## S4 method for signature 'formula,ANY,DataFrame'
glm(formula, family = c("gaussian",
  "binomial"), data, lambda = 0, alpha = 0, standardize = TRUE,
  solver = "auto")

Arguments

formula

A symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.
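
As a rough illustration of what the listed operators mean, the sketch below uses base R's terms() on a small local data.frame rather than a SparkR DataFrame. The helper labels_of and the columns y, x1, x2, x3 are hypothetical names chosen for this example, and it assumes SparkR interprets the supported operators the same way base R does.

```r
# Hypothetical toy data; column names are for illustration only.
toy <- data.frame(y = c(1, 2, 3, 4),
                  x1 = c(0, 1, 0, 1),
                  x2 = c(1, 1, 0, 0),
                  x3 = c(2, 0, 1, 3))

# Hypothetical helper: list the model terms a formula expands to.
labels_of <- function(f) attr(terms(f, data = toy), "term.labels")

print(labels_of(y ~ x1 + x2))          # '+' adds terms: "x1" "x2"
print(labels_of(y ~ x1 + x2 + x1:x2))  # ':' adds an interaction: "x1" "x2" "x1:x2"
print(labels_of(y ~ .))                # '.' means all other columns: "x1" "x2" "x3"
print(labels_of(y ~ . - x3))           # '-' removes a term: "x1" "x2"
```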

family

Error distribution. "gaussian" -> linear regression, "binomial" -> logistic regression.

data

DataFrame for training

lambda

Regularization parameter

alpha

Elastic-net mixing parameter (see glmnet's documentation for details)
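
For reference, the elastic-net penalty in glmnet's convention can be written as below. This assumes that the lambda and alpha arguments here follow that convention, as the pointer to glmnet's documentation suggests; \beta denotes the coefficient vector, excluding the intercept.

```latex
\lambda \left[ \frac{1 - \alpha}{2} \, \lVert \beta \rVert_2^2 + \alpha \, \lVert \beta \rVert_1 \right]
```

Under this convention alpha = 0 gives a pure L2 (ridge) penalty, alpha = 1 gives a pure L1 (lasso) penalty, and lambda = 0 disables regularization.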

standardize

Whether to standardize features before training

solver

The solver algorithm used for optimization. Supported values are "l-bfgs", "normal", and "auto". "l-bfgs" denotes Limited-memory BFGS, a limited-memory quasi-Newton optimization method. "normal" denotes using the normal equation as an analytical solution to the linear regression problem. The default value is "auto", which means that the solver algorithm is selected automatically.
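
To make the "normal" option concrete, here is a minimal base R sketch of solving a linear regression through the normal equation. It runs on a local data.frame and does not show how SparkR or MLlib implement the solver; the variable names are chosen for this example only.

```r
# Base R only; no Spark session is needed for this illustration.
data(iris)

# Design matrix with an intercept column and one feature.
X <- cbind(Intercept = 1, Sepal.Width = iris$Sepal.Width)
y <- iris$Sepal.Length

# Normal equation: solve (X'X) beta = X'y in one step, with no iterations.
beta_normal <- solve(crossprod(X), crossprod(X, y))

# Reference coefficients from R's own least-squares fit.
beta_lm <- coef(lm(Sepal.Length ~ Sepal.Width, data = iris))

print(drop(beta_normal))
print(beta_lm)
```

The two coefficient vectors should agree up to floating-point error. An iterative method such as L-BFGS reaches the same minimum by repeated gradient-based updates instead of a single linear solve.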

Value

A fitted MLlib model

Examples

## Not run: 
##D sc <- sparkR.init()
##D sqlContext <- sparkRSQL.init(sc)
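##D data(iris)
##D # createDataFrame() replaces "." in column names with "_" (Sepal.Length becomes Sepal_Length)
##D df <- createDataFrame(sqlContext, iris)
##D # Fit a linear regression; the data argument is named for clarity
##D model <- glm(Sepal_Length ~ Sepal_Width, data = df, family = "gaussian")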
##D summary(model)
## End(Not run)
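
The following sketch extends the example to the "binomial" family and to the regularization arguments. Like the example above, it is not meant to be run without a working Spark installation. The values lambda = 0.1 and alpha = 0.5 are arbitrary and chosen only for illustration, and the sketch assumes the SparkR 1.6 entry points sparkR.init() and sparkRSQL.init().

```r
library(SparkR)

# Start Spark and create the SQL context (SparkR 1.6 style initialization).
sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)

# Column names containing "." are converted to "_" by createDataFrame().
df <- createDataFrame(sqlContext, iris)

# Logistic regression needs a two-class label, so drop one species.
training <- filter(df, df$Species != "setosa")
logit_model <- glm(Species ~ Sepal_Length + Sepal_Width,
                   data = training, family = "binomial")
summary(logit_model)

# Linear regression with an elastic-net penalty (illustrative values only).
reg_model <- glm(Sepal_Length ~ Sepal_Width, data = df, family = "gaussian",
                 lambda = 0.1, alpha = 0.5)

# predict() returns a DataFrame that includes a "prediction" column.
predictions <- predict(reg_model, df)
head(select(predictions, "Sepal_Length", "prediction"))

sparkR.stop()
```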

[Package SparkR version 1.6.1 Index]