Package org.apache.spark.ml.regression
Class GeneralizedLinearRegressionModel
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.Model<M>
org.apache.spark.ml.PredictionModel<FeaturesType,M>
org.apache.spark.ml.regression.RegressionModel<Vector,GeneralizedLinearRegressionModel>
org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasAggregationDepth, HasFeaturesCol, HasFitIntercept, HasLabelCol, HasMaxIter, HasPredictionCol, HasRegParam, HasSolver, HasTol, HasWeightCol, PredictorParams, GeneralizedLinearRegressionBase, HasTrainingSummary<GeneralizedLinearRegressionTrainingSummary>, Identifiable, MLWritable
public class GeneralizedLinearRegressionModel
extends RegressionModel<Vector,GeneralizedLinearRegressionModel>
implements GeneralizedLinearRegressionBase, MLWritable, HasTrainingSummary<GeneralizedLinearRegressionTrainingSummary>
Model produced by GeneralizedLinearRegression.
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
Modifier and Type | Method | Description
final IntParam | aggregationDepth() | Param for suggested depth for treeAggregate (>= 2).
 | coefficients() |
 | copy(ParamMap extra) | Creates a copy of this instance with the same UID and some extra params.
 | evaluate(Dataset<?> dataset) | Evaluate the model on the given dataset, returning a summary of the results.
 | family() | Param for the name of family which is a description of the error distribution to be used in the model.
final BooleanParam | fitIntercept() | Param for whether to fit an intercept term.
double | intercept() |
 | link() | Param for the name of link function which provides the relationship between the linear predictor and the mean of the distribution function.
final DoubleParam | linkPower() | Param for the index in the power link function.
 | linkPredictionCol() | Param for link prediction (linear predictor) column name.
 | load(String path) |
final IntParam | maxIter() | Param for maximum number of iterations (>= 0).
int | numFeatures() | Returns the number of features the model was trained on.
 | offsetCol() | Param for offset column name.
double | predict(Vector features) | Predict label for the given features.
 | read() |
final DoubleParam | regParam() | Param for regularization parameter (>= 0).
 | setLinkPredictionCol(String value) | Sets the link prediction (linear predictor) column name.
 | solver() | The solver algorithm for optimization.
 | summary() | Gets R-like summary of model on training set.
final DoubleParam | tol() | Param for the convergence tolerance for iterative algorithms (>= 0).
 | toString() |
 | transform(Dataset<?> dataset) | Transforms dataset by reading from PredictionModel.featuresCol(), calling predict, and storing the predictions as a new column PredictionModel.predictionCol().
 | uid() | An immutable unique ID for the object and its derivatives.
final DoubleParam | variancePower() | Param for the power in the variance function of the Tweedie distribution which provides the relationship between the variance and mean of the distribution.
 | weightCol() | Param for weight column name.
 | write() | Returns an MLWriter instance for this ML instance.
Methods inherited from class org.apache.spark.ml.PredictionModel
featuresCol, labelCol, predictionCol, setFeaturesCol, setPredictionCol, transformSchema
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.regression.GeneralizedLinearRegressionBase
getFamily, getLink, getLinkPower, getLinkPredictionCol, getOffsetCol, getVariancePower, hasLinkPredictionCol, hasOffsetCol, hasWeightCol, validateAndTransformSchema
Methods inherited from interface org.apache.spark.ml.param.shared.HasAggregationDepth
getAggregationDepth
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
featuresCol, getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasFitIntercept
getFitIntercept
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol, labelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter
getMaxIter
Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol, predictionCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasRegParam
getRegParam
Methods inherited from interface org.apache.spark.ml.util.HasTrainingSummary
hasSummary, setSummary
Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Method Details
-
read
-
load
-
family
Description copied from interface: GeneralizedLinearRegressionBase
Param for the name of family which is a description of the error distribution to be used in the model. Supported options: "gaussian", "binomial", "poisson", "gamma" and "tweedie". Default is "gaussian".
Specified by: family in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
-
variancePower
Description copied from interface: GeneralizedLinearRegressionBase
Param for the power in the variance function of the Tweedie distribution which provides the relationship between the variance and mean of the distribution. Only applicable to the Tweedie family (see Tweedie Distribution (Wikipedia)). Supported values: 0 and [1, Inf). Note that variance power 0, 1, or 2 corresponds to the Gaussian, Poisson or Gamma family, respectively.
Specified by: variancePower in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
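To make the role of the variance power concrete, the following minimal Java sketch evaluates the Tweedie variance function under its usual definition V(mu) = mu^p, where p is the variance power. The class and method names (TweedieVariance, isSupported, variance) are hypothetical and are not part of the Spark API; the range check simply mirrors the supported values listed above.

```java
/** Hypothetical helper, not part of Spark: Tweedie variance function V(mu) = mu^p. */
final class TweedieVariance {
    private TweedieVariance() {}

    /** True for the documented supported values: 0 and [1, Inf). */
    static boolean isSupported(double variancePower) {
        return variancePower == 0.0
            || (variancePower >= 1.0 && !Double.isInfinite(variancePower));
    }

    /** Evaluates V(mu) = mu^p for a positive mean mu. */
    static double variance(double mu, double variancePower) {
        if (!isSupported(variancePower)) {
            throw new IllegalArgumentException(
                "variancePower must be 0 or in [1, Inf), got " + variancePower);
        }
        return Math.pow(mu, variancePower);
    }

    public static void main(String[] args) {
        double mu = 3.0;
        System.out.println(variance(mu, 0.0)); // constant variance (Gaussian case)
        System.out.println(variance(mu, 1.0)); // variance equal to the mean (Poisson case)
        System.out.println(variance(mu, 2.0)); // variance equal to the squared mean (Gamma case)
    }
}
```

Values strictly between 0 and 1 are rejected in this sketch because they fall outside the supported set stated in the parameter description.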
-
link
Description copied from interface: GeneralizedLinearRegressionBase
Param for the name of link function which provides the relationship between the linear predictor and the mean of the distribution function. Supported options: "identity", "log", "inverse", "logit", "probit", "cloglog" and "sqrt". This is used only when family is not "tweedie". The link function for the "tweedie" family must be specified through GeneralizedLinearRegressionBase.linkPower().
Specified by: link in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
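For orientation, the sketch below writes out the textbook forms of most of the link names listed above, each mapping a mean mu to a linear predictor eta = g(mu). It is an illustration only: NamedLinks is a hypothetical class, not Spark code, and "probit" is deliberately left out because it needs the standard normal quantile function, which the Java standard library does not provide.

```java
/** Hypothetical helper, not part of Spark: textbook forms of the named link functions. */
final class NamedLinks {
    private NamedLinks() {}

    /** Maps a mean mu to the linear predictor eta = g(mu) for the given link name. */
    static double link(String name, double mu) {
        switch (name) {
            case "identity": return mu;
            case "log":      return Math.log(mu);
            case "inverse":  return 1.0 / mu;
            case "sqrt":     return Math.sqrt(mu);
            case "logit":    return Math.log(mu / (1.0 - mu));
            case "cloglog":  return Math.log(-Math.log(1.0 - mu));
            default:
                // "probit" and unknown names are not covered by this sketch.
                throw new IllegalArgumentException("Link not covered by this sketch: " + name);
        }
    }

    public static void main(String[] args) {
        String[] names = {"identity", "log", "inverse", "sqrt", "logit", "cloglog"};
        for (String name : names) {
            // 0.25 is a valid mean for every link in the list.
            System.out.println(name + ": " + link(name, 0.25));
        }
    }
}
```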
-
linkPower
Description copied from interface: GeneralizedLinearRegressionBase
Param for the index in the power link function. Only applicable to the Tweedie family. Note that link power 0, 1, -1 or 0.5 corresponds to the Log, Identity, Inverse or Sqrt link, respectively. When not set, this value defaults to 1 - GeneralizedLinearRegressionBase.variancePower(), which matches the R "statmod" package.
Specified by: linkPower in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
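The correspondence between link power values and named links can be checked with a small sketch of the conventional power link, g(mu) = mu^q for q != 0 and g(mu) = log(mu) for q = 0. PowerLink and its methods are hypothetical names used for illustration, not Spark classes; defaultLinkPower encodes only the documented default of 1 - variancePower.

```java
/** Hypothetical helper, not part of Spark: the conventional power link function. */
final class PowerLink {
    private PowerLink() {}

    /** Documented default when linkPower is not set: 1 - variancePower. */
    static double defaultLinkPower(double variancePower) {
        return 1.0 - variancePower;
    }

    /** eta = g(mu): log(mu) when linkPower is 0, otherwise mu^linkPower. */
    static double link(double mu, double linkPower) {
        return linkPower == 0.0 ? Math.log(mu) : Math.pow(mu, linkPower);
    }

    /** mu = g^{-1}(eta): exp(eta) when linkPower is 0, otherwise eta^(1/linkPower). */
    static double unlink(double eta, double linkPower) {
        return linkPower == 0.0 ? Math.exp(eta) : Math.pow(eta, 1.0 / linkPower);
    }

    public static void main(String[] args) {
        double mu = 4.0;
        System.out.println(link(mu, 0.0));  // same as the log link
        System.out.println(link(mu, 1.0));  // same as the identity link
        System.out.println(link(mu, -1.0)); // same as the inverse link
        System.out.println(link(mu, 0.5));  // same as the sqrt link
        // A Poisson-like variance power of 1 gives a default link power of 0 (log link).
        System.out.println(defaultLinkPower(1.0));
    }
}
```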
-
linkPredictionCol
Description copied from interface: GeneralizedLinearRegressionBase
Param for link prediction (linear predictor) column name. Default is not set, which means we do not output link prediction.
Specified by: linkPredictionCol in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
-
offsetCol
Description copied from interface: GeneralizedLinearRegressionBase
Param for offset column name. If this is not set or empty, we treat all instance offsets as 0.0. The feature specified as offset has a constant coefficient of 1.0.
Specified by: offsetCol in interface GeneralizedLinearRegressionBase
Returns: (undocumented)
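The statement that the offset has a constant coefficient of 1.0 can be read as the offset being added directly to the linear predictor. The sketch below illustrates that reading under the usual GLM definition eta = intercept + x·beta + offset; OffsetPredictor and linearPredictor are hypothetical names, and this is not Spark's internal implementation.

```java
/** Hypothetical helper, not part of Spark: linear predictor with an offset term. */
final class OffsetPredictor {
    private OffsetPredictor() {}

    /** Computes eta = intercept + sum(features[i] * coefficients[i]) + offset. */
    static double linearPredictor(double[] features, double[] coefficients,
                                  double intercept, double offset) {
        if (features.length != coefficients.length) {
            throw new IllegalArgumentException("features and coefficients differ in length");
        }
        // The offset is added as-is, i.e. with a fixed coefficient of 1.0.
        double eta = intercept + offset;
        for (int i = 0; i < features.length; i++) {
            eta += features[i] * coefficients[i];
        }
        return eta;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0};
        double[] beta = {0.5, 0.25};
        // An offset of 0.0 corresponds to the case where no offset column is set.
        System.out.println(linearPredictor(x, beta, 0.5, 0.0));
        System.out.println(linearPredictor(x, beta, 0.5, 2.0));
    }
}
```

A common use of an offset, outside anything this page states, is a log-exposure term in count models with a log link.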
-
solver
Description copied from interface: GeneralizedLinearRegressionBase
The solver algorithm for optimization. Supported options: "irls" (iteratively reweighted least squares). Default: "irls".
Specified by: solver in interface GeneralizedLinearRegressionBase
Specified by: solver in interface HasSolver
Returns: (undocumented)
-
aggregationDepth
Description copied from interface: HasAggregationDepth
Param for suggested depth for treeAggregate (>= 2).
Specified by: aggregationDepth in interface HasAggregationDepth
Returns: (undocumented)
-
weightCol
Description copied from interface: HasWeightCol
Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
Specified by: weightCol in interface HasWeightCol
Returns: (undocumented)
-
regParam
Description copied from interface: HasRegParam
Param for regularization parameter (>= 0).
Specified by: regParam in interface HasRegParam
Returns: (undocumented)
-
tol
Description copied from interface: HasTol
Param for the convergence tolerance for iterative algorithms (>= 0).
-
maxIter
Description copied from interface: HasMaxIter
Param for maximum number of iterations (>= 0).
Specified by: maxIter in interface HasMaxIter
Returns: (undocumented)
-
fitIntercept
Description copied from interface: HasFitIntercept
Param for whether to fit an intercept term.
Specified by: fitIntercept in interface HasFitIntercept
Returns: (undocumented)
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable
Returns: (undocumented)
-
coefficients
-
intercept
public double intercept()
-
setLinkPredictionCol
Sets the link prediction (linear predictor) column name.
Parameters: value - (undocumented)
Returns: (undocumented)
-
predict
Description copied from class: PredictionModel
Predict label for the given features. This method is used to implement transform() and output PredictionModel.predictionCol().
Specified by: predict in class PredictionModel<Vector,GeneralizedLinearRegressionModel>
Parameters: features - (undocumented)
Returns: (undocumented)
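Under the usual GLM definition, the predicted label is the mean obtained by applying the inverse link to the linear predictor. The sketch below illustrates this for the "logit" link only, where the inverse link is the logistic function; GlmPredictSketch and predictLogit are hypothetical names, and the code is an assumption-based illustration rather than the logic of this class.

```java
/** Hypothetical helper, not part of Spark: prediction as inverse link of the linear predictor. */
final class GlmPredictSketch {
    private GlmPredictSketch() {}

    /** Returns mu = 1 / (1 + exp(-eta)) with eta = intercept + sum(features[i] * coefficients[i]). */
    static double predictLogit(double[] features, double[] coefficients, double intercept) {
        if (features.length != coefficients.length) {
            throw new IllegalArgumentException("features and coefficients differ in length");
        }
        double eta = intercept;
        for (int i = 0; i < features.length; i++) {
            eta += features[i] * coefficients[i];
        }
        // Inverse of the logit link: maps any real eta into the interval (0, 1).
        return 1.0 / (1.0 + Math.exp(-eta));
    }

    public static void main(String[] args) {
        double[] beta = {2.0, 2.0};
        System.out.println(predictLogit(new double[] {1.0, -1.0}, beta, 0.0)); // eta is 0 here
        System.out.println(predictLogit(new double[] {1.0, 0.0}, beta, 0.0));  // eta is 2 here
    }
}
```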
-
transform
Description copied from class: PredictionModel
Transforms dataset by reading from PredictionModel.featuresCol(), calling predict, and storing the predictions as a new column PredictionModel.predictionCol().
Overrides: transform in class PredictionModel<Vector,GeneralizedLinearRegressionModel>
Parameters: dataset - input dataset
Returns: transformed dataset with PredictionModel.predictionCol() of type Double
-
summary
Gets R-like summary of model on training set. An exception is thrown if there is no summary available.
Specified by: summary in interface HasTrainingSummary<GeneralizedLinearRegressionTrainingSummary>
Returns: (undocumented)
-
evaluate
Evaluate the model on the given dataset, returning a summary of the results.
Parameters: dataset - (undocumented)
Returns: (undocumented)
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Model<GeneralizedLinearRegressionModel>
Parameters: extra - (undocumented)
Returns: (undocumented)
-
write
Returns an MLWriter instance for this ML instance.
For GeneralizedLinearRegressionModel, this does NOT currently save the training summary(). An option to save summary() may be added in the future.
Specified by: write in interface MLWritable
Returns: (undocumented)
-
numFeatures
public int numFeatures()
Description copied from class: PredictionModel
Returns the number of features the model was trained on. If unknown, returns -1.
Overrides: numFeatures in class PredictionModel<Vector,GeneralizedLinearRegressionModel>
-
toString
Specified by: toString in interface Identifiable
Overrides: toString in class Object
-