spark.naiveBayes {SparkR}    R Documentation

Naive Bayes Models

Description

spark.naiveBayes fits a Bernoulli naive Bayes model against a SparkDataFrame. Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write.ml/read.ml to save/load fitted models. Only categorical data is supported.

Usage

spark.naiveBayes(data, formula, ...)

## S4 method for signature 'NaiveBayesModel'
predict(object, newData)

## S4 method for signature 'NaiveBayesModel'
summary(object, ...)

## S4 method for signature 'SparkDataFrame,formula'
spark.naiveBayes(data, formula,
  smoothing = 1, ...)

## S4 method for signature 'NaiveBayesModel,character'
write.ml(object, path,
  overwrite = FALSE)

Arguments

data

A SparkDataFrame of observations and labels for model fitting

formula

A symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.
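As a rough guide to how these operators read, base R's terms() can expand a formula against a local data.frame. This is only a sketch using base R and the built-in infert data set; SparkR resolves formulas on its own side and may differ in detail. The variable names all_but_age and with_interaction are illustrative.

```r
# Base R only: expand formulas against the local 'infert' data.frame.
# SparkR's own formula handling may differ; this just shows operator meaning.

# '.' means "all other columns"; '-' drops a term from that set.
all_but_age <- attr(terms(education ~ . - age, data = infert), "term.labels")

# '+' adds a term; ':' denotes an interaction between two columns.
with_interaction <- attr(
  terms(education ~ parity + induced:case, data = infert),
  "term.labels"
)

print(all_but_age)
print(with_interaction)
```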

object

A naive Bayes model fitted by spark.naiveBayes

newData

A SparkDataFrame for testing

smoothing

Smoothing parameter for additive (Laplace) smoothing. Default is 1.
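To illustrate the effect of such a parameter, the sketch below applies the textbook additive-smoothing estimate to a vector of category counts in base R. It is not the Spark implementation, whose exact formula may differ; the helper smoothed_probs and the counts are hypothetical.

```r
# Hypothetical helper: textbook additive smoothing over K category counts.
# P(category) = (count + smoothing) / (total + smoothing * K)
smoothed_probs <- function(counts, smoothing = 1) {
  k <- length(counts)
  (counts + smoothing) / (sum(counts) + smoothing * k)
}

# Made-up counts; the "low" category was never observed.
counts <- c(low = 0, mid = 3, high = 7)

# smoothing = 0 leaves the unseen category with probability zero.
unsmoothed <- smoothed_probs(counts, smoothing = 0)

# smoothing = 1 gives every category a small non-zero probability.
smoothed <- smoothed_probs(counts, smoothing = 1)

print(unsmoothed)
print(smoothed)
```

With smoothing = 0, a category that never co-occurs with a label receives probability zero, which zeroes out the whole product for that label at prediction time; a positive value avoids this.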

path

The directory where the model is saved

overwrite

Whether to overwrite the output path if it already exists. The default is FALSE, which means an exception is thrown if the output path exists.

Value

predict returns a SparkDataFrame containing predicted labels in a column named "prediction"

summary returns a list with two components: apriori, the label distribution, and tables, the conditional probabilities given the target label
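The sketch below shows, in base R, how a label distribution and a table of conditional probabilities combine into a prediction under a Bernoulli model over 0/1 features. The objects are toy stand-ins with made-up values, named after the two components for readability; the actual layout of the objects returned by summary may differ.

```r
# Toy stand-ins for the two components (values are made up).
apriori <- c(yes = 0.4, no = 0.6)

# Assumed layout: tables[label, feature] = P(feature = 1 | label)
tables <- matrix(c(0.8, 0.1,
                   0.3, 0.5),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("yes", "no"), c("f1", "f2")))

# One observation with binary features.
x <- c(f1 = 1, f2 = 0)

# Bernoulli likelihood per label: product of p^x * (1 - p)^(1 - x).
likelihood <- apply(tables, 1, function(p) prod(p^x * (1 - p)^(1 - x)))

# Bayes' rule: posterior is proportional to prior times likelihood.
posterior <- apriori * likelihood / sum(apriori * likelihood)
predicted <- names(which.max(posterior))

print(posterior)
print(predicted)
```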

spark.naiveBayes returns a fitted naive Bayes model

Note

predict(NaiveBayesModel) since 2.0.0

summary(NaiveBayesModel) since 2.0.0

spark.naiveBayes since 2.0.0

write.ml(NaiveBayesModel, character) since 2.0.0

See Also

e1071: https://cran.r-project.org/web/packages/e1071/

read.ml

Examples

## Not run: 
library(SparkR)
# start (or connect to) a Spark session; required before createDataFrame
sparkR.session()

# convert the local 'infert' data.frame into a SparkDataFrame
df <- createDataFrame(infert)

# fit a Bernoulli naive Bayes model
model <- spark.naiveBayes(df, education ~ ., smoothing = 0)

# get the summary of the model
summary(model)

# make predictions
predictions <- predict(model, df)

# save and load the model
path <- "path/to/model"
write.ml(model, path)
savedModel <- read.ml(path)
summary(savedModel)
## End(Not run)

[Package SparkR version 2.0.0 Index]