Class NaiveBayes
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Estimator<M>
org.apache.spark.ml.Predictor<FeaturesType,E,M>
  
org.apache.spark.ml.classification.Classifier<FeaturesType,E,M>
  
org.apache.spark.ml.classification.ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
  
org.apache.spark.ml.classification.NaiveBayes
- All Implemented Interfaces:
- Serializable, org.apache.spark.internal.Logging, ClassifierParams, NaiveBayesParams, ProbabilisticClassifierParams, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasProbabilityCol, HasRawPredictionCol, HasThresholds, HasWeightCol, PredictorParams, DefaultParamsWritable, Identifiable, MLWritable
public class NaiveBayes
extends ProbabilisticClassifier<Vector,NaiveBayes,NaiveBayesModel>
implements NaiveBayesParams, DefaultParamsWritable  
Naive Bayes Classifiers.
 It supports Multinomial NB, which can handle finitely supported discrete data. For example, by
 converting documents into TF-IDF vectors, it can be used for document classification. By making
 every feature vector binary (0/1), it can also be used as Bernoulli NB.
 The input feature values for Multinomial NB and Bernoulli NB must be nonnegative.
 Since 3.0.0, it supports Complement NB, which is an adaptation of Multinomial NB. Specifically,
 Complement NB uses statistics from the complement of each class to compute the model's coefficients.
 The inventors of Complement NB show empirically that the parameter estimates for CNB are more stable
 than those for Multinomial NB. Like Multinomial NB, the input feature values for Complement NB must
 be nonnegative.
 Since 3.0.0, it also supports Gaussian NB, which can handle continuous data.
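To make "statistics from the complement of each class" concrete, here is a minimal plain-Java sketch. It is not Spark's implementation: the class name `ComplementCountsSketch` and both helper methods are hypothetical, and it only shows how per-class feature sums and their complements could be computed from nonnegative features.

```java
import java.util.Arrays;

// Hypothetical illustration only; this does not use or reproduce Spark code.
final class ComplementCountsSketch {

    /** result[c][j] = sum of feature j over all rows whose label is c. */
    static double[][] classCounts(double[][] features, int[] labels, int numClasses) {
        int numFeatures = features[0].length;
        double[][] counts = new double[numClasses][numFeatures];
        for (int i = 0; i < features.length; i++) {
            for (int j = 0; j < numFeatures; j++) {
                double v = features[i][j];
                if (v < 0.0) {
                    // Multinomial and Complement NB require nonnegative inputs.
                    throw new IllegalArgumentException("feature values must be nonnegative");
                }
                counts[labels[i]][j] += v;
            }
        }
        return counts;
    }

    /** result[c][j] = sum of feature j over all rows whose label is NOT c. */
    static double[][] complementCounts(double[][] counts) {
        int numClasses = counts.length;
        int numFeatures = counts[0].length;
        double[] total = new double[numFeatures];
        for (double[] row : counts) {
            for (int j = 0; j < numFeatures; j++) {
                total[j] += row[j];
            }
        }
        double[][] complement = new double[numClasses][numFeatures];
        for (int c = 0; c < numClasses; c++) {
            for (int j = 0; j < numFeatures; j++) {
                complement[c][j] = total[j] - counts[c][j];
            }
        }
        return complement;
    }

    public static void main(String[] args) {
        double[][] x = {{2, 0}, {1, 1}, {0, 3}};
        int[] y = {0, 0, 1};
        double[][] counts = classCounts(x, y, 2);
        System.out.println(Arrays.deepToString(counts));                   // [[3.0, 1.0], [0.0, 3.0]]
        System.out.println(Arrays.deepToString(complementCounts(counts))); // [[0.0, 3.0], [3.0, 1.0]]
    }
}
```

Multinomial NB would estimate each class's coefficients from its own row of `counts`, whereas a complement-style estimate for class c starts from row c of `complementCounts`.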
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
- 
Constructor Summary
Constructors
NaiveBayes()
- 
Method Summary
Modifier and Type, Method, Description
- copy(ParamMap extra): Creates a copy of this instance with the same UID and some extra params.
- static NaiveBayes load(String path)
- modelType(): The model type which is a string (case-sensitive).
- static MLReader<T> read()
- setModelType(String value): Set the model type using a string (case-sensitive).
- setSmoothing(double value): Set the smoothing parameter.
- setWeightCol(String value): Sets the value of param weightCol().
- final DoubleParam smoothing(): The smoothing parameter.
- uid(): An immutable unique ID for the object and its derivatives.
- weightCol(): Param for weight column name.

Methods inherited from class org.apache.spark.ml.classification.ProbabilisticClassifier
probabilityCol, setProbabilityCol, setThresholds, thresholds

Methods inherited from class org.apache.spark.ml.classification.Classifier
rawPredictionCol, setRawPredictionCol

Methods inherited from class org.apache.spark.ml.Predictor
featuresCol, fit, labelCol, predictionCol, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema

Methods inherited from class org.apache.spark.ml.PipelineStage
params

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write

Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
featuresCol, getFeaturesCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol, labelCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol, predictionCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasProbabilityCol
getProbabilityCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasRawPredictionCol
getRawPredictionCol, rawPredictionCol

Methods inherited from interface org.apache.spark.ml.param.shared.HasThresholds
getThresholds

Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol

Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Methods inherited from interface org.apache.spark.ml.util.MLWritable
save

Methods inherited from interface org.apache.spark.ml.classification.NaiveBayesParams
getModelType, getSmoothing

Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, estimateMatadataSize, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn

Methods inherited from interface org.apache.spark.ml.classification.ProbabilisticClassifierParams
validateAndTransformSchema
- 
Constructor Details
NaiveBayes
public NaiveBayes()
 
- 
- 
Method Details
load
- 
read
- 
smoothing
Description copied from interface: NaiveBayesParams
The smoothing parameter (default = 1.0).
- Specified by:
- smoothing in interface NaiveBayesParams
- Returns:
- (undocumented)
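As a rough illustration of what a smoothing parameter does in a multinomial-style model, the sketch below applies additive smoothing to one class's feature counts. The class `SmoothingSketch` and its method are hypothetical names, and the formula is the textbook additive-smoothing estimate, assumed here for illustration rather than taken from this page.

```java
// Hypothetical illustration of additive (Laplace/Lidstone) smoothing; not Spark code.
final class SmoothingSketch {

    /**
     * Returns log((counts[j] + smoothing) / (sum(counts) + smoothing * n))
     * for each of the n features of a single class.
     */
    static double[] smoothedLogProbs(double[] counts, double smoothing) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        double logDenominator = Math.log(total + smoothing * counts.length);
        double[] logProbs = new double[counts.length];
        for (int j = 0; j < counts.length; j++) {
            logProbs[j] = Math.log(counts[j] + smoothing) - logDenominator;
        }
        return logProbs;
    }

    public static void main(String[] args) {
        // Feature 1 never occurs in this class.
        double[] counts = {3.0, 0.0};
        double[] smoothed = smoothedLogProbs(counts, 1.0);   // 1.0 is the documented default
        double[] unsmoothed = smoothedLogProbs(counts, 0.0);
        // With smoothing the probabilities are roughly 0.8 and 0.2.
        System.out.println(Math.exp(smoothed[0]) + " " + Math.exp(smoothed[1]));
        // Without smoothing the unseen feature gets log-probability -Infinity.
        System.out.println(unsmoothed[1]);
    }
}
```

The point of the nonzero default is visible in the second print: with no smoothing, a feature never seen in a class drives that class's score to negative infinity for any input containing it.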
 
- 
modelType
Description copied from interface: NaiveBayesParams
The model type, which is a string (case-sensitive). Supported options: "multinomial", "complement", "bernoulli", "gaussian" (default = "multinomial").
- Specified by:
- modelType in interface NaiveBayesParams
- Returns:
- (undocumented)
 
- 
weightCol
Description copied from interface: HasWeightCol
Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
- Specified by:
- weightCol in interface HasWeightCol
- Returns:
- (undocumented)
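The effect of instance weights on count-based statistics can be sketched in plain Java. This is an assumption-laden illustration, not Spark's code: `WeightedCountsSketch` and its method are hypothetical, and a `null` weight array stands in for "weight column not set", in which case every row counts with weight 1.0.

```java
import java.util.Arrays;

// Hypothetical illustration of instance weighting; not Spark code.
final class WeightedCountsSketch {

    /**
     * Sums each feature over all rows, scaling row i by weights[i].
     * A null weights array models an unset weight column: every weight is 1.0.
     */
    static double[] weightedFeatureSums(double[][] features, double[] weights) {
        int numFeatures = features[0].length;
        double[] sums = new double[numFeatures];
        for (int i = 0; i < features.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            for (int j = 0; j < numFeatures; j++) {
                sums[j] += w * features[i][j];
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};
        System.out.println(Arrays.toString(weightedFeatureSums(x, null)));                    // [4.0, 6.0]
        System.out.println(Arrays.toString(weightedFeatureSums(x, new double[]{2.0, 0.5})));  // [3.5, 6.0]
    }
}
```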
 
- 
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
- uid in interface Identifiable
- Returns:
- (undocumented)
 
- 
setSmoothing
Set the smoothing parameter. Default is 1.0.
- Parameters:
- value - (undocumented)
- Returns:
- (undocumented)
 
- 
setModelType
Set the model type using a string (case-sensitive). Supported options: "multinomial", "complement", "bernoulli", and "gaussian". Default is "multinomial".
- Parameters:
- value - (undocumented)
- Returns:
- (undocumented)
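Because the model type string is case-sensitive, a caller may want to check a value before passing it on. The following is a small stand-alone sketch of such a check; `ModelTypeCheckSketch` and `isSupported` are hypothetical helpers, not part of the Spark API, and the option list is copied from the description above.

```java
import java.util.Set;

// Hypothetical pre-check helper; Spark performs its own validation.
final class ModelTypeCheckSketch {

    // The four option strings listed in the documentation for this parameter.
    static final Set<String> SUPPORTED =
            Set.of("multinomial", "complement", "bernoulli", "gaussian");

    /** Exact, case-sensitive membership test. */
    static boolean isSupported(String modelType) {
        return modelType != null && SUPPORTED.contains(modelType);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("gaussian")); // true
        System.out.println(isSupported("Gaussian")); // false: the match is case-sensitive
    }
}
```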
 
- 
setWeightCol
Sets the value of param weightCol(). If this is not set or empty, all instance weights are treated as 1.0. It is not set by default.
- Parameters:
- value - (undocumented)
- Returns:
- (undocumented)
 
- 
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
- copy in interface Params
- Specified by:
- copy in class Predictor<Vector,NaiveBayes,NaiveBayesModel>
- Parameters:
- extra - (undocumented)
- Returns:
- (undocumented)