Package org.apache.spark.ml.regression
Class LinearRegressionModel
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Transformer
org.apache.spark.ml.Model<M>
org.apache.spark.ml.PredictionModel<FeaturesType,M>
org.apache.spark.ml.regression.RegressionModel<Vector,LinearRegressionModel>
org.apache.spark.ml.regression.LinearRegressionModel
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasAggregationDepth, HasElasticNetParam, HasFeaturesCol, HasFitIntercept, HasLabelCol, HasLoss, HasMaxBlockSizeInMB, HasMaxIter, HasPredictionCol, HasRegParam, HasSolver, HasStandardization, HasTol, HasWeightCol, PredictorParams, LinearRegressionParams, GeneralMLWritable, HasTrainingSummary<LinearRegressionTrainingSummary>, Identifiable, MLWritable
public class LinearRegressionModel
extends RegressionModel<Vector,LinearRegressionModel>
implements LinearRegressionParams, GeneralMLWritable, HasTrainingSummary<LinearRegressionTrainingSummary>
Model produced by LinearRegression.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Method Summary
Modifier and Type | Method | Description
final IntParam | aggregationDepth() | Param for suggested depth for treeAggregate (>= 2).
Vector | coefficients() |
LinearRegressionModel | copy(ParamMap extra) | Creates a copy of this instance with the same UID and some extra params.
final DoubleParam | elasticNetParam() | Param for the ElasticNet mixing parameter, in range [0, 1].
final DoubleParam | epsilon() | The shape parameter to control the amount of robustness.
LinearRegressionSummary | evaluate(Dataset<?> dataset) | Evaluates the model on a test dataset.
final BooleanParam | fitIntercept() | Param for whether to fit an intercept term.
double | intercept() |
static LinearRegressionModel | load(String path) |
Param<String> | loss() | The loss function to be optimized.
final DoubleParam | maxBlockSizeInMB() | Param for Maximum memory in MB for stacking input data into blocks.
final IntParam | maxIter() | Param for maximum number of iterations (>= 0).
int | numFeatures() | Returns the number of features the model was trained on.
double | predict(Vector features) | Predict label for the given features.
static MLReader<LinearRegressionModel> | read() |
final DoubleParam | regParam() | Param for regularization parameter (>= 0).
double | scale() |
Param<String> | solver() | The solver algorithm for optimization.
final BooleanParam | standardization() | Param for whether to standardize the training features before fitting the model.
LinearRegressionTrainingSummary | summary() | Gets summary (e.g. residuals, mse, r-squared) of model on training set.
final DoubleParam | tol() | Param for the convergence tolerance for iterative algorithms (>= 0).
String | toString() |
String | uid() | An immutable unique ID for the object and its derivatives.
Param<String> | weightCol() | Param for weight column name.
GeneralMLWriter | write() | Returns a GeneralMLWriter instance for this ML instance.
Methods inherited from class org.apache.spark.ml.PredictionModel
featuresCol, labelCol, predictionCol, setFeaturesCol, setPredictionCol, transform, transformSchema
Methods inherited from class org.apache.spark.ml.Transformer
transform, transform, transform
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.param.shared.HasAggregationDepth
getAggregationDepth
Methods inherited from interface org.apache.spark.ml.param.shared.HasElasticNetParam
getElasticNetParam
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
featuresCol, getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasFitIntercept
getFitIntercept
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol, labelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxBlockSizeInMB
getMaxBlockSizeInMB
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter
getMaxIter
Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol, predictionCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasRegParam
getRegParam
Methods inherited from interface org.apache.spark.ml.param.shared.HasStandardization
getStandardization
Methods inherited from interface org.apache.spark.ml.util.HasTrainingSummary
hasSummary, setSummary
Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol
Methods inherited from interface org.apache.spark.ml.regression.LinearRegressionParams
getEpsilon, validateAndTransformSchema
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
-
Method Details
-
read
-
load
-
solver
Description copied from interface: LinearRegressionParams
The solver algorithm for optimization. Supported options: "l-bfgs", "normal" and "auto". Default: "auto".
- Specified by:
solver in interface HasSolver
- Specified by:
solver in interface LinearRegressionParams
- Returns:
(undocumented)
-
loss
Description copied from interface: LinearRegressionParams
The loss function to be optimized. Supported options: "squaredError" and "huber". Default: "squaredError".
- Specified by:
loss in interface HasLoss
- Specified by:
loss in interface LinearRegressionParams
- Returns:
(undocumented)
-
epsilon
Description copied from interface: LinearRegressionParams
The shape parameter to control the amount of robustness. Must be > 1.0. At larger values of epsilon, the Huber criterion becomes more similar to least squares regression; for small values of epsilon, the criterion is more similar to L1 regression. Default is 1.35 to get as much robustness as possible while retaining 95% statistical efficiency for normally distributed data. It matches sklearn HuberRegressor and is "M" from "A robust hybrid of lasso and ridge regression". Only valid when "loss" is "huber".
- Specified by:
epsilon in interface LinearRegressionParams
- Returns:
(undocumented)
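The effect of epsilon can be illustrated with the piecewise Huber function used in the sklearn HuberRegressor formulation that the description refers to: quadratic for small scaled residuals, linear for large ones. This is a minimal sketch in plain Java, not Spark's implementation; the class name `HuberShapeSketch` and its method are hypothetical, and the input `z` is assumed to be a residual that has already been divided by a scale estimate.

```java
final class HuberShapeSketch {

    /**
     * Piecewise Huber function (assumed form, following sklearn's HuberRegressor):
     * z^2 when |z| < epsilon, and 2*epsilon*|z| - epsilon^2 otherwise.
     */
    static double huber(double z, double epsilon) {
        if (epsilon <= 1.0) {
            throw new IllegalArgumentException("epsilon must be > 1.0");
        }
        double a = Math.abs(z);
        return a < epsilon ? z * z : 2.0 * epsilon * a - epsilon * epsilon;
    }

    public static void main(String[] args) {
        double[] scaledResiduals = {0.5, 1.0, 3.0, 10.0};
        for (double z : scaledResiduals) {
            // A larger epsilon keeps more residuals in the quadratic (least-squares) regime.
            System.out.printf("z=%.1f  squared=%.4f  huber(1.35)=%.4f  huber(5.0)=%.4f%n",
                    z, z * z, huber(z, 1.35), huber(z, 5.0));
        }
    }
}
```

With the default 1.35, a scaled residual of 3.0 is penalized linearly; with epsilon 5.0 the same residual is still penalized quadratically, which is the "more similar to least squares" behavior described above.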
-
maxBlockSizeInMB
Description copied from interface: HasMaxBlockSizeInMB
Param for Maximum memory in MB for stacking input data into blocks. Data is stacked within partitions. If the value exceeds the remaining data size in a partition, it is adjusted to the data size. The default, 0.0, means an optimal value is chosen, depending on the specific algorithm. Must be >= 0.
- Specified by:
maxBlockSizeInMB in interface HasMaxBlockSizeInMB
- Returns:
(undocumented)
-
aggregationDepth
Description copied from interface: HasAggregationDepth
Param for suggested depth for treeAggregate (>= 2).
- Specified by:
aggregationDepth in interface HasAggregationDepth
- Returns:
(undocumented)
-
weightCol
Description copied from interface: HasWeightCol
Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
- Specified by:
weightCol in interface HasWeightCol
- Returns:
(undocumented)
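The "missing weights count as 1.0" rule can be sketched with a weighted mean squared error in plain Java. This is only an illustration of instance weighting, not a description of how Spark applies weights internally; the class `InstanceWeightSketch` and its method are hypothetical, and a null or empty array stands in for an unset weight column.

```java
final class InstanceWeightSketch {

    /**
     * Weighted mean squared error. A null or empty weights array is treated
     * as "no weight column": every instance then gets weight 1.0.
     */
    static double weightedMse(double[] labels, double[] predictions, double[] weights) {
        if (labels.length == 0 || labels.length != predictions.length) {
            throw new IllegalArgumentException("labels and predictions must be non-empty and equal in length");
        }
        boolean unweighted = weights == null || weights.length == 0;
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double w = unweighted ? 1.0 : weights[i];
            double error = labels[i] - predictions[i];
            weightedSum += w * error * error;
            weightTotal += w;
        }
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        double[] labels = {1.0, 2.0};
        double[] predictions = {0.0, 0.0};
        System.out.println(weightedMse(labels, predictions, null));
        System.out.println(weightedMse(labels, predictions, new double[] {3.0, 1.0}));
    }
}
```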
-
standardization
Description copied from interface: HasStandardization
Param for whether to standardize the training features before fitting the model.
- Specified by:
standardization in interface HasStandardization
- Returns:
(undocumented)
-
fitIntercept
Description copied from interface: HasFitIntercept
Param for whether to fit an intercept term.
- Specified by:
fitIntercept in interface HasFitIntercept
- Returns:
(undocumented)
-
tol
Description copied from interface: HasTol
Param for the convergence tolerance for iterative algorithms (>= 0).
-
maxIter
Description copied from interface: HasMaxIter
Param for maximum number of iterations (>= 0).
- Specified by:
maxIter in interface HasMaxIter
- Returns:
(undocumented)
-
elasticNetParam
Description copied from interface: HasElasticNetParam
Param for the ElasticNet mixing parameter, in range [0, 1]. For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty.
- Specified by:
elasticNetParam in interface HasElasticNetParam
- Returns:
(undocumented)
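How alpha blends the two penalties can be sketched in plain Java. This is an illustration rather than Spark's implementation: the class `ElasticNetPenaltySketch` and its method are hypothetical, and the 1/2 factor on the L2 term is an assumption that follows a common elastic-net convention.

```java
final class ElasticNetPenaltySketch {

    /**
     * Assumed penalty form:
     * regParam * (alpha * ||w||_1 + (1 - alpha) * 0.5 * ||w||_2^2).
     * alpha = 0 leaves only the L2 term; alpha = 1 leaves only the L1 term.
     */
    static double penalty(double[] coefficients, double regParam, double alpha) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in [0, 1]");
        }
        if (regParam < 0.0) {
            throw new IllegalArgumentException("regParam must be >= 0");
        }
        double l1 = 0.0;
        double l2Squared = 0.0;
        for (double w : coefficients) {
            l1 += Math.abs(w);
            l2Squared += w * w;
        }
        return regParam * (alpha * l1 + (1.0 - alpha) * 0.5 * l2Squared);
    }

    public static void main(String[] args) {
        double[] w = {3.0, -4.0};
        for (double alpha : new double[] {0.0, 0.5, 1.0}) {
            System.out.println("alpha=" + alpha + " penalty=" + penalty(w, 0.1, alpha));
        }
    }
}
```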
-
regParam
Description copied from interface: HasRegParam
Param for regularization parameter (>= 0).
- Specified by:
regParam in interface HasRegParam
- Returns:
(undocumented)
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
(undocumented)
-
coefficients
-
intercept
public double intercept()
-
scale
public double scale()
-
numFeatures
public int numFeatures()
Description copied from class: PredictionModel
Returns the number of features the model was trained on. If unknown, returns -1.
- Overrides:
numFeatures in class PredictionModel<Vector,LinearRegressionModel>
-
summary
Gets summary (e.g. residuals, mse, r-squared) of model on training set. An exception is thrown if hasSummary is false.
- Specified by:
summary in interface HasTrainingSummary<LinearRegressionTrainingSummary>
- Returns:
(undocumented)
-
evaluate
Evaluates the model on a test dataset.
- Parameters:
dataset - Test dataset to evaluate model on.
- Returns:
(undocumented)
-
predict
Description copied from class: PredictionModel
Predict label for the given features. This method is used to implement transform() and output PredictionModel.predictionCol().
- Specified by:
predict in class PredictionModel<Vector,LinearRegressionModel>
- Parameters:
features - (undocumented)
- Returns:
(undocumented)
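For a linear regression model, the predicted label is the dot product of the coefficients and the features, plus the intercept. The sketch below shows that arithmetic with plain arrays in place of Spark's Vector; the class `LinearPredictSketch` and its method are hypothetical stand-ins, not the Spark API.

```java
final class LinearPredictSketch {

    /** Returns dot(coefficients, features) + intercept. */
    static double predict(double[] coefficients, double intercept, double[] features) {
        if (coefficients.length != features.length) {
            throw new IllegalArgumentException("feature count does not match coefficient count");
        }
        double dot = 0.0;
        for (int i = 0; i < coefficients.length; i++) {
            dot += coefficients[i] * features[i];
        }
        return dot + intercept;
    }

    public static void main(String[] args) {
        double[] coefficients = {2.0, -1.0};
        double[] features = {3.0, 4.0};
        System.out.println(predict(coefficients, 0.5, features));
    }
}
```

The length check mirrors the role of numFeatures(): a feature vector whose size differs from the number of coefficients cannot be scored.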
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
copy in interface Params
- Specified by:
copy in class Model<LinearRegressionModel>
- Parameters:
extra - (undocumented)
- Returns:
(undocumented)
-
write
Returns a GeneralMLWriter instance for this ML instance.
For LinearRegressionModel, this does NOT currently save the training summary(). An option to save summary() may be added in the future.
This also does not save the Model.parent() currently.
- Specified by:
write in interface GeneralMLWritable
- Specified by:
write in interface MLWritable
- Returns:
(undocumented)
-
toString
- Specified by:
toString in interface Identifiable
- Overrides:
toString in class Object
-