Package org.apache.spark.ml.regression
Class FMRegressor
Object
org.apache.spark.ml.PipelineStage
org.apache.spark.ml.Estimator<M>
org.apache.spark.ml.Predictor<FeaturesType,Learner,M>
org.apache.spark.ml.regression.Regressor<Vector,FMRegressor,FMRegressionModel>
org.apache.spark.ml.regression.FMRegressor
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, Params, HasFeaturesCol, HasFitIntercept, HasLabelCol, HasMaxIter, HasPredictionCol, HasRegParam, HasSeed, HasSolver, HasStepSize, HasTol, HasWeightCol, PredictorParams, FactorizationMachines, FactorizationMachinesParams, FMRegressorParams, DefaultParamsWritable, Identifiable, MLWritable
public class FMRegressor
extends Regressor<Vector,FMRegressor,FMRegressionModel>
implements FactorizationMachines, FMRegressorParams, DefaultParamsWritable, org.apache.spark.internal.Logging
Factorization Machines learning algorithm for regression.
It supports two solvers: plain gradient descent ("gd") and AdamW ("adamW").
The implementation is based on: S. Rendle. "Factorization machines" 2010.
FM can estimate interactions even in problems with huge sparsity (such as advertising and recommendation systems). The FM formula is:
$$ \begin{align} y = w_0 + \sum\limits^n_{i=1} w_i x_i + \sum\limits^n_{i=1} \sum\limits^n_{j=i+1} \langle v_i, v_j \rangle x_i x_j \end{align} $$
The first two terms are the global bias and the linear term (the same as in linear regression), and the last term is the pairwise interaction term. v_i describes the i-th variable with k factors.
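To make the formula concrete, here is a minimal, dependency-free sketch in plain Java. It is an illustration only, not Spark's implementation: the class and method names (`FmPredictionSketch`, `predictNaive`, `predictFast`) and the sample numbers are hypothetical. `predictNaive` evaluates the formula term by term; `predictFast` uses the linear-time rewrite of the pairwise term given in Rendle (2010).

```java
/** Dependency-free sketch of the FM prediction formula; not Spark's implementation. */
final class FmPredictionSketch {

    /** Direct evaluation: bias + linear term + every pairwise interaction, O(k * n^2). */
    static double predictNaive(double w0, double[] w, double[][] v, double[] x) {
        int n = x.length;
        double y = w0;                                   // global bias w_0
        for (int i = 0; i < n; i++) {
            y += w[i] * x[i];                            // linear term w_i * x_i
        }
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double dot = 0.0;                        // inner product <v_i, v_j>
                for (int f = 0; f < v[i].length; f++) {
                    dot += v[i][f] * v[j][f];
                }
                y += dot * x[i] * x[j];                  // pairwise interaction
            }
        }
        return y;
    }

    /**
     * Same value in O(k * n), using the identity
     * sum_{i<j} <v_i, v_j> x_i x_j = 1/2 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ].
     */
    static double predictFast(double w0, double[] w, double[][] v, double[] x) {
        int n = x.length;
        int k = (n == 0) ? 0 : v[0].length;              // number of factors
        double y = w0;
        for (int i = 0; i < n; i++) {
            y += w[i] * x[i];
        }
        for (int f = 0; f < k; f++) {
            double sum = 0.0;
            double sumOfSquares = 0.0;
            for (int i = 0; i < n; i++) {
                double t = v[i][f] * x[i];
                sum += t;
                sumOfSquares += t * t;
            }
            y += 0.5 * (sum * sum - sumOfSquares);
        }
        return y;
    }

    public static void main(String[] args) {
        double w0 = 0.5;
        double[] w = {1.0, -2.0, 0.5};
        double[][] v = {{1.0, 0.0}, {0.5, 1.0}, {2.0, -1.0}};  // n = 3 features, k = 2 factors
        double[] x = {1.0, 2.0, 0.0};                          // sparse input: x_3 is zero
        System.out.println(predictNaive(w0, w, v, x));         // prints -1.5
        System.out.println(predictFast(w0, w, v, x));          // prints -1.5
    }
}
```

Because every interaction term is multiplied by x_i x_j, features that are zero contribute nothing, which is why the model stays cheap to evaluate on sparse inputs.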
The FM regression model uses MSE loss, which can be minimized by gradient descent. Regularization terms such as L2 are usually added to the loss function to prevent overfitting.
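The objective described above can be sketched in a few lines of plain Java. This is a hedged illustration, not Spark's code: the class `RegularizedMseSketch` and its methods are hypothetical, and the exact scaling of the penalty (here `regParam` times the sum of squared parameters) is an assumption, since this page does not state the convention Spark uses.

```java
/** Sketch of an MSE loss with an L2 penalty; names and scaling are illustrative assumptions. */
final class RegularizedMseSketch {

    /** Mean of squared differences between predictions and labels. */
    static double meanSquaredError(double[] predictions, double[] labels) {
        if (predictions.length != labels.length || predictions.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < predictions.length; i++) {
            double diff = predictions[i] - labels[i];
            sum += diff * diff;
        }
        return sum / predictions.length;
    }

    /** L2 penalty: regParam * sum of squared parameters (assumed convention). */
    static double l2Penalty(double[] params, double regParam) {
        double sumOfSquares = 0.0;
        for (double p : params) {
            sumOfSquares += p * p;
        }
        return regParam * sumOfSquares;
    }

    /** Regularized objective: data-fit term plus penalty. */
    static double loss(double[] predictions, double[] labels, double[] params, double regParam) {
        return meanSquaredError(predictions, labels) + l2Penalty(params, regParam);
    }

    public static void main(String[] args) {
        double[] predictions = {1.0, 2.0, 3.0};
        double[] labels = {1.0, 1.0, 5.0};
        double[] params = {1.0, -2.0};
        // Squared errors are 0, 1 and 4, so the MSE is 5/3; the penalty is 0.1 * 5 = 0.5.
        System.out.println(loss(predictions, labels, params, 0.1));
    }
}
```

A larger `regParam` shifts the minimum toward smaller parameter values, trading some fit on the training data for less overfitting.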
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
FMRegressor | copy(ParamMap extra) | Creates a copy of this instance with the same UID and some extra params.
final IntParam | factorSize() | Param for dimensionality of the factors (>= 0)
final BooleanParam | fitIntercept() | Param for whether to fit an intercept term.
final BooleanParam | fitLinear() | Param for whether to fit linear term (aka 1-way term)
final DoubleParam | initStd() | Param for standard deviation of initial coefficients
static FMRegressor | load(String path) |
final IntParam | maxIter() | Param for maximum number of iterations (>= 0).
final DoubleParam | miniBatchFraction() | Param for mini-batch fraction, must be in range (0, 1]
static MLReader<T> | read() |
final DoubleParam | regParam() | Param for regularization parameter (>= 0).
final LongParam | seed() | Param for random seed.
FMRegressor | setFactorSize(int value) | Set the dimensionality of the factors.
FMRegressor | setFitIntercept(boolean value) | Set whether to fit intercept term.
FMRegressor | setFitLinear(boolean value) | Set whether to fit linear term.
FMRegressor | setInitStd(double value) | Set the standard deviation of initial coefficients.
FMRegressor | setMaxIter(int value) | Set the maximum number of iterations.
FMRegressor | setMiniBatchFraction(double value) | Set the mini-batch fraction parameter.
FMRegressor | setRegParam(double value) | Set the L2 regularization parameter.
FMRegressor | setSeed(long value) | Set the random seed for weight initialization.
FMRegressor | setSolver(String value) | Set the solver algorithm used for optimization.
FMRegressor | setStepSize(double value) | Set the initial step size for the first step (like learning rate).
FMRegressor | setTol(double value) | Set the convergence tolerance of iterations.
final Param<String> | solver() | The solver algorithm for optimization.
DoubleParam | stepSize() | Param for Step size to be used for each iteration of optimization (> 0).
final DoubleParam | tol() | Param for the convergence tolerance for iterative algorithms (>= 0).
String | uid() | An immutable unique ID for the object and its derivatives.
final Param<String> | weightCol() | Param for weight column name.
Methods inherited from class org.apache.spark.ml.Predictor
featuresCol, fit, labelCol, predictionCol, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
Methods inherited from class org.apache.spark.ml.PipelineStage
params
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.ml.util.DefaultParamsWritable
write
Methods inherited from interface org.apache.spark.ml.regression.FactorizationMachines
initCoefficients, trainImpl
Methods inherited from interface org.apache.spark.ml.regression.FactorizationMachinesParams
getFactorSize, getFitLinear, getInitStd, getMiniBatchFraction
Methods inherited from interface org.apache.spark.ml.param.shared.HasFeaturesCol
featuresCol, getFeaturesCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasFitIntercept
getFitIntercept
Methods inherited from interface org.apache.spark.ml.param.shared.HasLabelCol
getLabelCol, labelCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasMaxIter
getMaxIter
Methods inherited from interface org.apache.spark.ml.param.shared.HasPredictionCol
getPredictionCol, predictionCol
Methods inherited from interface org.apache.spark.ml.param.shared.HasRegParam
getRegParam
Methods inherited from interface org.apache.spark.ml.param.shared.HasStepSize
getStepSize
Methods inherited from interface org.apache.spark.ml.param.shared.HasWeightCol
getWeightCol
Methods inherited from interface org.apache.spark.ml.util.Identifiable
toString
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Methods inherited from interface org.apache.spark.ml.util.MLWritable
save
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, onParamChange, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn
Methods inherited from interface org.apache.spark.ml.PredictorParams
validateAndTransformSchema
-
Constructor Details
-
FMRegressor
-
FMRegressor
public FMRegressor()
-
-
Method Details
-
load
-
read
-
factorSize
Description copied from interface: FactorizationMachinesParams
Param for dimensionality of the factors (>= 0)
- Specified by:
factorSize in interface FactorizationMachinesParams
- Returns:
- (undocumented)
-
fitLinear
Description copied from interface: FactorizationMachinesParams
Param for whether to fit linear term (aka 1-way term)
- Specified by:
fitLinear in interface FactorizationMachinesParams
- Returns:
- (undocumented)
-
miniBatchFraction
Description copied from interface: FactorizationMachinesParams
Param for mini-batch fraction, must be in range (0, 1]
- Specified by:
miniBatchFraction in interface FactorizationMachinesParams
- Returns:
- (undocumented)
-
initStd
Description copied from interface: FactorizationMachinesParams
Param for standard deviation of initial coefficients
- Specified by:
initStd in interface FactorizationMachinesParams
- Returns:
- (undocumented)
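As a rough illustration of how `initStd` and `seed` could interact, the sketch below fills a factor matrix with seeded Gaussian draws scaled by `initStd`. It is not taken from Spark's source: the class `FactorInitSketch`, the method `initFactors`, and the zero-mean Gaussian distribution are all assumptions made for the example, since this page documents only the standard deviation.

```java
/** Sketch of seeded factor initialization; the zero-mean Gaussian is an assumption. */
final class FactorInitSketch {

    /** Returns a numFeatures x factorSize matrix of draws with standard deviation initStd. */
    static double[][] initFactors(int numFeatures, int factorSize, double initStd, long seed) {
        java.util.Random rng = new java.util.Random(seed);   // same seed gives the same matrix
        double[][] v = new double[numFeatures][factorSize];
        for (int i = 0; i < numFeatures; i++) {
            for (int f = 0; f < factorSize; f++) {
                v[i][f] = rng.nextGaussian() * initStd;      // scale a standard normal draw
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // 4 features, 8 factors and a standard deviation of 0.01, matching the documented defaults.
        double[][] v = initFactors(4, 8, 0.01, 42L);
        System.out.println(v.length + " x " + v[0].length);  // prints 4 x 8
    }
}
```

Fixing the seed makes a training run reproducible, while a small `initStd` keeps the initial interaction terms close to zero.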
-
solver
Description copied from interface: FactorizationMachinesParams
The solver algorithm for optimization. Supported options: "gd", "adamW". Default: "adamW"
- Specified by:
solver in interface FactorizationMachinesParams
- Specified by:
solver in interface HasSolver
- Returns:
- (undocumented)
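The difference between the two solver options can be sketched with a single update step each. The code below is a generic textbook illustration in plain Java, not Spark's optimizer: the class names are hypothetical, the hyperparameters `beta1 = 0.9`, `beta2 = 0.999` and `epsilon = 1e-8` are commonly used defaults assumed here, and how Spark maps `regParam` onto the weight decay is not stated on this page.

```java
/** Textbook sketch of one "gd" step and one "adamW" step; not Spark's optimizer. */
final class SolverStepSketch {

    /** Plain gradient descent: theta <- theta - stepSize * grad. */
    static void gdStep(double[] params, double[] grad, double stepSize) {
        for (int i = 0; i < params.length; i++) {
            params[i] -= stepSize * grad[i];
        }
    }

    /** AdamW: Adam moment estimates plus weight decay applied directly to the parameters. */
    static final class AdamW {
        final double stepSize;
        final double weightDecay;
        final double beta1 = 0.9;      // assumed default
        final double beta2 = 0.999;    // assumed default
        final double epsilon = 1e-8;   // assumed default
        final double[] m;              // first-moment (mean) estimate
        final double[] v;              // second-moment (uncentered variance) estimate
        int t = 0;                     // number of steps taken so far

        AdamW(int dim, double stepSize, double weightDecay) {
            this.stepSize = stepSize;
            this.weightDecay = weightDecay;
            this.m = new double[dim];
            this.v = new double[dim];
        }

        void step(double[] params, double[] grad) {
            t++;
            double correction1 = 1.0 - Math.pow(beta1, t);   // bias correction for m
            double correction2 = 1.0 - Math.pow(beta2, t);   // bias correction for v
            for (int i = 0; i < params.length; i++) {
                m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
                v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
                double mHat = m[i] / correction1;
                double vHat = v[i] / correction2;
                // The decay term is added outside the adaptive ratio (decoupled weight decay).
                params[i] -= stepSize * (mHat / (Math.sqrt(vHat) + epsilon) + weightDecay * params[i]);
            }
        }
    }

    public static void main(String[] args) {
        double[] gdParams = {1.0};
        gdStep(gdParams, new double[] {0.5}, 0.1);
        System.out.println(gdParams[0]);     // 1.0 - 0.1 * 0.5

        double[] adamParams = {1.0};
        new AdamW(1, 0.1, 0.0).step(adamParams, new double[] {0.5});
        System.out.println(adamParams[0]);   // close to 0.9: the first step has size near stepSize
    }
}
```

With plain gradient descent the step scales with the gradient magnitude, whereas AdamW normalizes each coordinate by its running second-moment estimate, so the step size parameter has a different practical meaning under each solver.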
-
weightCol
Description copied from interface: HasWeightCol
Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
- Specified by:
weightCol in interface HasWeightCol
- Returns:
- (undocumented)
-
regParam
Description copied from interface: HasRegParam
Param for regularization parameter (>= 0).
- Specified by:
regParam in interface HasRegParam
- Returns:
- (undocumented)
-
fitIntercept
Description copied from interface: HasFitIntercept
Param for whether to fit an intercept term.
- Specified by:
fitIntercept in interface HasFitIntercept
- Returns:
- (undocumented)
-
seed
Description copied from interface: HasSeed
Param for random seed.
-
tol
Description copied from interface: HasTol
Param for the convergence tolerance for iterative algorithms (>= 0).
-
stepSize
Description copied from interface: HasStepSize
Param for Step size to be used for each iteration of optimization (> 0).
- Specified by:
stepSize in interface HasStepSize
- Returns:
- (undocumented)
-
maxIter
Description copied from interface: HasMaxIter
Param for maximum number of iterations (>= 0).
- Specified by:
maxIter in interface HasMaxIter
- Returns:
- (undocumented)
-
uid
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
- Specified by:
uid in interface Identifiable
- Returns:
- (undocumented)
-
setFactorSize
Set the dimensionality of the factors. Default is 8.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setFitIntercept
Set whether to fit intercept term. Default is true.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setFitLinear
Set whether to fit linear term. Default is true.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setRegParam
Set the L2 regularization parameter. Default is 0.0.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setMiniBatchFraction
Set the mini-batch fraction parameter. Default is 1.0.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setInitStd
Set the standard deviation of initial coefficients. Default is 0.01.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setMaxIter
Set the maximum number of iterations. Default is 100.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setStepSize
Set the initial step size for the first step (like learning rate). Default is 1.0.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setTol
Set the convergence tolerance of iterations. Default is 1E-6.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setSolver
Set the solver algorithm used for optimization. Supported options: "gd", "adamW". Default: "adamW"
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
setSeed
Set the random seed for weight initialization.
- Parameters:
value - (undocumented)
- Returns:
- (undocumented)
-
copy
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
- Specified by:
copy in interface Params
- Specified by:
copy in class Predictor<Vector,FMRegressor,FMRegressionModel>
- Parameters:
extra - (undocumented)
- Returns:
- (undocumented)
-