public class GradientDescent extends Object implements Optimizer, org.apache.spark.internal.Logging
Modifier and Type | Method and Description |
---|---|
`Vector` | `optimize(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)` Runs gradient descent on the given training data. |
`scala.Tuple2<Vector,double[]>` | `optimizeWithLossReturned(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)` Runs gradient descent on the given training data. |
`static void` | `org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)` |
`static org.slf4j.Logger` | `org$apache$spark$internal$Logging$$log_()` |
`static scala.Tuple2<Vector,double[]>` | `runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, double stepSize, int numIterations, double regParam, double miniBatchFraction, Vector initialWeights)` Alias of runMiniBatchSGD with convergenceTol set to default value of 0.001. |
`static scala.Tuple2<Vector,double[]>` | `runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, double stepSize, int numIterations, double regParam, double miniBatchFraction, Vector initialWeights, double convergenceTol)` Run stochastic gradient descent (SGD) in parallel using mini batches. |
`GradientDescent` | `setConvergenceTol(double tolerance)` Set the convergence tolerance. |
`GradientDescent` | `setGradient(Gradient gradient)` Set the gradient function (of the loss function of one single data example) to be used for SGD. |
`GradientDescent` | `setMiniBatchFraction(double fraction)` Set fraction of data to be used for each SGD iteration. |
`GradientDescent` | `setNumIterations(int iters)` Set the number of iterations for SGD. |
`GradientDescent` | `setRegParam(double regParam)` Set the regularization parameter. |
`GradientDescent` | `setStepSize(double step)` Set the initial step size of SGD for the first step. |
`GradientDescent` | `setUpdater(Updater updater)` Set the updater function to actually perform a gradient step in a given direction. |
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
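The setters return the GradientDescent instance itself, so configuration is typically chained before calling optimize. Below is a minimal Scala sketch under a few assumptions: this page does not document a public constructor, so the sketch takes an already-available GradientDescent instance `gd` (for example, the optimizer exposed by an MLlib algorithm), and the choice of LeastSquaresGradient with SquaredL2Updater, the parameter values, and the helper name `configureAndRun` are illustrative only.

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.optimization.{GradientDescent, LeastSquaresGradient, SquaredL2Updater}
import org.apache.spark.rdd.RDD

// Sketch only: `gd` is assumed to be an existing GradientDescent instance.
def configureAndRun(gd: GradientDescent,
                    data: RDD[(Double, Vector)],   // (label, [feature values]) pairs
                    initialWeights: Vector): Vector = {
  gd.setGradient(new LeastSquaresGradient())  // gradient of the per-example loss
    .setUpdater(new SquaredL2Updater())       // gradient step with L2 regularization
    .setStepSize(0.1)                         // initial step size for the first step
    .setNumIterations(100)                    // cap on SGD iterations
    .setRegParam(0.01)                        // regularization parameter
    .setMiniBatchFraction(1.0)                // use the full data set each iteration
    .setConvergenceTol(0.001)                 // stop early once the weights stop moving
  // Runs gradient descent on the given training data and returns the solution vector.
  gd.optimize(data, initialWeights)
}
```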
public static scala.Tuple2<Vector,double[]> runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, double stepSize, int numIterations, double regParam, double miniBatchFraction, Vector initialWeights, double convergenceTol)

Run stochastic gradient descent (SGD) in parallel using mini batches.

Parameters:
data - Input data for SGD. RDD of the set of data examples, each of the form (label, [feature values]).
gradient - Gradient object (used to compute the gradient of the loss function of one single data example)
updater - Updater function to actually perform a gradient step in a given direction.
stepSize - initial step size for the first step
numIterations - number of iterations that SGD should be run
regParam - regularization parameter
miniBatchFraction - fraction of the input data set that should be used for one iteration of SGD. Default value 1.0.
convergenceTol - Minibatch iteration will end before numIterations if the relative difference between the current weight and the previous weight is less than this value. In measuring convergence, the L2 norm is used. Default value 0.001. Must be between 0.0 and 1.0 inclusive.
initialWeights - (undocumented)
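For illustration, a minimal Scala sketch of calling this method directly on a toy data set. The SparkContext `sc`, the two-example data set, the parameter values, and the choice of LogisticGradient with SquaredL2Updater are assumptions made for the example, not part of this API.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

// Assumes an existing SparkContext `sc`; the tiny (label, features) data set is illustrative only.
def runExample(sc: SparkContext): Unit = {
  val data = sc.parallelize(Seq(
    (1.0, Vectors.dense(1.2, 0.5)),
    (0.0, Vectors.dense(-0.3, -1.1))
  ))

  val (weights, lossHistory) = GradientDescent.runMiniBatchSGD(
    data,
    new LogisticGradient(),   // gradient of the per-example loss
    new SquaredL2Updater(),   // update step with L2 regularization
    1.0,                      // stepSize
    100,                      // numIterations
    0.01,                     // regParam
    1.0,                      // miniBatchFraction
    Vectors.zeros(2),         // initialWeights (must match the feature dimension)
    0.001)                    // convergenceTol

  println(s"weights: $weights, iterations recorded: ${lossHistory.length}")
}
```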
public static scala.Tuple2<Vector,double[]> runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, double stepSize, int numIterations, double regParam, double miniBatchFraction, Vector initialWeights)

Alias of runMiniBatchSGD with convergenceTol set to the default value of 0.001.

Parameters:
data, gradient, updater, stepSize, numIterations, regParam, miniBatchFraction, initialWeights - (undocumented); see the nine-argument overload above.

public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()

public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
public GradientDescent setStepSize(double step)

Set the initial step size of SGD for the first step.

Parameters:
step - (undocumented)

public GradientDescent setMiniBatchFraction(double fraction)

Set fraction of data to be used for each SGD iteration.

Parameters:
fraction - (undocumented)

public GradientDescent setNumIterations(int iters)

Set the number of iterations for SGD.

Parameters:
iters - (undocumented)

public GradientDescent setRegParam(double regParam)

Set the regularization parameter.

Parameters:
regParam - (undocumented)

public GradientDescent setConvergenceTol(double tolerance)

Set the convergence tolerance. Convergence is measured on the L2 norm of the difference between successive solution vectors:
- If the norm of the new solution vector is greater than 1, the difference is treated as a relative tolerance, i.e. it is normalized by the norm of the new solution vector.
- If the norm of the new solution vector is less than or equal to 1, the difference is treated as an absolute tolerance, with no normalization.

Must be between 0.0 and 1.0 inclusive.

Parameters:
tolerance - (undocumented)
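To make the rule concrete, here is a small Scala sketch of such a check. `isConverged` is a hypothetical helper written for illustration; it mirrors the rule described above but is not the class's internal code.

```scala
import org.apache.spark.mllib.linalg.Vector

// Hypothetical helper illustrating the tolerance rule described above.
def isConverged(previous: Vector, current: Vector, convergenceTol: Double): Boolean = {
  def l2(xs: Array[Double]): Double = math.sqrt(xs.map(x => x * x).sum)
  // L2 norm of the difference between successive solution vectors.
  val diff = l2(previous.toArray.zip(current.toArray).map { case (p, c) => p - c })
  // ||current|| > 1  -> relative tolerance (normalize by ||current||)
  // ||current|| <= 1 -> absolute tolerance (no normalization)
  diff < convergenceTol * math.max(l2(current.toArray), 1.0)
}
```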
public GradientDescent setGradient(Gradient gradient)

Set the gradient function (of the loss function of one single data example) to be used for SGD.

Parameters:
gradient - (undocumented)

public GradientDescent setUpdater(Updater updater)

Set the updater function to actually perform a gradient step in a given direction.

Parameters:
updater - (undocumented)

public Vector optimize(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)

Runs gradient descent on the given training data.

public scala.Tuple2<Vector,double[]> optimizeWithLossReturned(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)

Runs gradient descent on the given training data.

Parameters:
data - training data
initialWeights - initial weights
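As a sketch of reading the returned tuple in Scala, assuming (as above) an already-configured GradientDescent instance `gd`; the helper name `runAndInspect` is made up for the example.

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.optimization.GradientDescent
import org.apache.spark.rdd.RDD

// Hypothetical helper: run the optimizer and look at the recorded losses.
def runAndInspect(gd: GradientDescent, data: RDD[(Double, Vector)], init: Vector): Vector = {
  val (weights, losses) = gd.optimizeWithLossReturned(data, init)
  // `losses` holds the loss history for the iterations that actually ran, so its
  // length shows whether SGD stopped early on the convergence tolerance.
  println(s"iterations recorded: ${losses.length}, last loss: ${losses.lastOption.getOrElse(Double.NaN)}")
  weights
}
```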