glm {SparkR} | R Documentation |

Fits a generalized linear model, similarly to R's glm().

glm(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(...), model = TRUE, method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)

## S4 method for signature 'formula,ANY,SparkDataFrame'
glm(formula, family = gaussian, data, epsilon = 1e-06, maxit = 25, weightCol = NULL, var.power = 0, link.power = 1 - var.power)

`formula` |
a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'. |

`family` |
a description of the error distribution and link function to be used in the model.
This can be a character string naming a family function, a family function, or
the result of a call to a family function. See R's family documentation at
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/family.html.
Currently these families are supported: `binomial`, `gaussian`, `poisson`, `Gamma`, and `tweedie`. |

`data` |
a SparkDataFrame or R's glm data for training. |

`weights` |
an optional vector of ‘prior weights’ to be used
in the fitting process. Should be `NULL` or a numeric vector. |

`subset` |
an optional vector specifying a subset of observations to be used in the fitting process. |

`na.action` |
a function which indicates what should happen
when the data contain `NA`s. |

`start` |
starting values for the parameters in the linear predictor. |

`etastart` |
starting values for the linear predictor. |

`mustart` |
starting values for the vector of means. |

`offset` |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |

`control` |
a list of parameters for controlling the fitting
process. For `glm.fit` this is passed to `glm.control`. |

`model` |
a logical value indicating whether the model frame should be included as a component of the returned value. |

`method` |
the method to be used in fitting the model. The default
method `"glm.fit"` uses iteratively reweighted least squares (IWLS).
User-supplied fitting functions can be supplied either as a function
or a character string naming a function, with a function which takes
the same arguments as `glm.fit`. |

`x,y` |
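For `glm`: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. |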

`contrasts` |
an optional list. See the `contrasts.arg` of `model.matrix.default`. |

`...` |
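For `glm`: arguments to be used to form the default `control` argument if it is not supplied directly. For `weights`: further arguments passed to or from other methods. |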

`epsilon` |
positive convergence tolerance of iterations. |

`maxit` |
integer giving the maximal number of IRLS iterations. |

`weightCol` |
the weight column name. If this is not set or `NULL`, we treat all instance weights as 1.0. |

`var.power` |
the index of the power variance function in the Tweedie family. |

`link.power` |
the index of the power link function in the Tweedie family. |
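
As an illustration of the `formula` and `family` arguments described above, the following sketch uses base R only (`terms` and `stats::glm`, no Spark session). It assumes the SparkDataFrame method interprets the listed formula operators and the three ways of naming a family in the same way, as the descriptions above state. The variable names are illustrative.

```r
# Base R only: no Spark session is needed for this sketch.
titanic <- as.data.frame(Titanic)  # columns: Class, Sex, Age, Survived, Freq

# Formula operators: '.' stands for every column except the response,
# '-' removes a term, and ':' adds an interaction between two terms.
main_terms <- attr(terms(Freq ~ . - Class, data = titanic), "term.labels")
inter_terms <- attr(terms(Freq ~ Sex + Age + Sex:Age), "term.labels")
print(main_terms)
print(inter_terms)

# Three equivalent ways to specify the family.
fit_chr  <- stats::glm(Freq ~ Sex + Age, data = titanic, family = "gaussian")
fit_fun  <- stats::glm(Freq ~ Sex + Age, data = titanic, family = gaussian)
fit_call <- stats::glm(Freq ~ Sex + Age, data = titanic,
                       family = gaussian(link = "identity"))

# The three fits describe the same model, so their coefficients agree.
print(all.equal(coef(fit_chr), coef(fit_fun)))
print(all.equal(coef(fit_fun), coef(fit_call)))
```

Passing the result of a call, as in the third fit, is the form to use when a non-default link is needed.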

`glm` returns a fitted generalized linear model.

glm since 1.5.0

```
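library(SparkR)

# Start (or connect to) a Spark session; this needs a Spark installation.
sparkR.session()

# Convert the built-in Titanic contingency table to a SparkDataFrame.
titanic <- as.data.frame(Titanic)
df <- createDataFrame(titanic)

# Fit a Gaussian GLM of cell frequency on sex and age, then inspect it.
model <- glm(Freq ~ Sex + Age, df, family = "gaussian")
summary(model)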
```
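
The arguments specific to the SparkDataFrame method (`epsilon`, `maxit`, `weightCol`) can be sketched in the same way. This is a minimal sketch, assuming a working Spark installation. The toy data and its column names (`y`, `x`, `w`) are made up for illustration and do not come from any real dataset.

```r
library(SparkR)

# Requires a Spark installation that SparkR can reach.
sparkR.session()

# Hypothetical toy data: `y` is a count response, `x` a predictor,
# and `w` an illustrative per-row weight column.
local_df <- data.frame(
  y = c(1, 3, 2, 6, 8, 12),
  x = c(1, 2, 3, 4, 5, 6),
  w = c(1, 1, 2, 2, 1, 1)
)
df <- createDataFrame(local_df)

# Poisson GLM with a tighter convergence tolerance, a higher cap on
# IRLS iterations, and row weights read from column `w`.
model <- glm(y ~ x, family = "poisson", data = df,
             epsilon = 1e-08, maxit = 50, weightCol = "w")
summary(model)

sparkR.session.stop()
```

Leaving `weightCol` at its default of `NULL` fits the same model with every row weighted equally.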

[Package *SparkR* version 2.2.0 Index]