GradientDescent (Spark 1.4.1 JavaDoc)


org.apache.spark.mllib.optimization
Class GradientDescent

Object
  org.apache.spark.mllib.optimization.GradientDescent

All Implemented Interfaces:: java.io.Serializable, Logging, Optimizer

public class GradientDescent
extends Object
implements Optimizer, Logging

Class used to solve an optimization problem using gradient descent.

Parameters:: gradient - Gradient function to be used; updater - Updater used to update the weights after every iteration.

See Also:: Serialized Form

Method Summary
`Vector`	`optimize(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)` :: DeveloperApi :: Runs gradient descent on the given training data.
`static scala.Tuple2<Vector,double[]>`	`runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, double stepSize, int numIterations, double regParam, double miniBatchFraction, Vector initialWeights)` Run stochastic gradient descent (SGD) in parallel using mini batches.
`GradientDescent`	`setGradient(Gradient gradient)` Set the gradient function (of the loss function of one single data example) to be used for SGD.
`GradientDescent`	`setMiniBatchFraction(double fraction)` :: Experimental :: Set fraction of data to be used for each SGD iteration.
`GradientDescent`	`setNumIterations(int iters)` Set the number of iterations for SGD.
`GradientDescent`	`setRegParam(double regParam)` Set the regularization parameter.
`GradientDescent`	`setStepSize(double step)` Set the initial step size of SGD for the first step.
`GradientDescent`	`setUpdater(Updater updater)` Set the updater function to actually perform a gradient step in a given direction.

Methods inherited from class Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Methods inherited from interface org.apache.spark.Logging
`initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning`

Method Detail

runMiniBatchSGD

public static scala.Tuple2<Vector,double[]> runMiniBatchSGD(RDD<scala.Tuple2<Object,Vector>> data,
                                                            Gradient gradient,
                                                            Updater updater,
                                                            double stepSize,
                                                            int numIterations,
                                                            double regParam,
                                                            double miniBatchFraction,
                                                            Vector initialWeights)

Run stochastic gradient descent (SGD) in parallel using mini batches. In each iteration, a subset of the total data (of fraction miniBatchFraction) is sampled to compute a gradient estimate. Sampling and averaging the subgradients over this subset are performed with one standard Spark map-reduce per iteration.

Parameters:: data - Input data for SGD: an RDD of data examples, each of the form (label, [feature values]); gradient - Gradient object, used to compute the gradient of the loss function of a single data example; updater - Updater function that performs a gradient step in a given direction; stepSize - Initial step size for the first step; numIterations - Number of iterations that SGD should run; regParam - Regularization parameter; miniBatchFraction - Fraction of the input data set to use for one iteration of SGD. Default value 1.0; initialWeights - (undocumented)
Returns:: A tuple containing two elements. The first element is a column vector (a Vector) containing the weight for every feature, and the second element is an array containing the stochastic loss computed at every iteration.
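
The loop described above can be sketched on a single machine in plain Java. This is only an illustration under stated assumptions, not Spark's implementation: the class `MiniBatchSgdSketch`, its `Result` holder, the `seed` argument, and the hard-coded least-squares loss are all hypothetical, regularization is omitted, and the distributed map-reduce is replaced by an ordinary loop. An empty sample is recorded as NaN and skipped, which is a choice made for this sketch.

```java
import java.util.Random;

/** Single-machine sketch of mini-batch SGD for least squares. Not Spark's implementation. */
final class MiniBatchSgdSketch {

    /** Final weights plus the average mini-batch loss recorded at each iteration. */
    static final class Result {
        final double[] weights;
        final double[] lossHistory;

        Result(double[] weights, double[] lossHistory) {
            this.weights = weights;
            this.lossHistory = lossHistory;
        }
    }

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            sum += a[j] * b[j];
        }
        return sum;
    }

    static Result run(double[][] features, double[] labels, double stepSize,
                      int numIterations, double miniBatchFraction,
                      double[] initialWeights, long seed) {
        double[] w = initialWeights.clone();
        double[] lossHistory = new double[numIterations];
        Random rng = new Random(seed);

        for (int t = 1; t <= numIterations; t++) {
            double[] gradSum = new double[w.length];
            double lossSum = 0.0;
            int batchSize = 0;

            // Sample each example independently with probability miniBatchFraction.
            for (int i = 0; i < labels.length; i++) {
                if (rng.nextDouble() >= miniBatchFraction) {
                    continue;
                }
                double diff = dot(w, features[i]) - labels[i];
                lossSum += 0.5 * diff * diff;
                for (int j = 0; j < w.length; j++) {
                    gradSum[j] += diff * features[i][j];
                }
                batchSize++;
            }

            if (batchSize == 0) {
                // Empty sample: skip the update (a choice made for this sketch).
                lossHistory[t - 1] = Double.NaN;
                continue;
            }

            // Average the per-example gradients, then step with a decaying step size.
            double eta = stepSize / Math.sqrt(t);
            for (int j = 0; j < w.length; j++) {
                w[j] -= eta * gradSum[j] / batchSize;
            }
            lossHistory[t - 1] = lossSum / batchSize;
        }
        return new Result(w, lossHistory);
    }
}
```

With miniBatchFraction set to 1.0 every example is sampled in every iteration, so the sketch reduces to ordinary (deterministic) gradient descent with a decaying step size.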

setStepSize

public GradientDescent setStepSize(double step)

Set the initial step size of SGD for the first step. Default 1.0. In subsequent steps, the step size decreases as stepSize/sqrt(t), where t is the iteration number.

Parameters:: step - (undocumented)
Returns:: (undocumented)
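
The decay rule stepSize/sqrt(t) can be illustrated with a few lines of plain Java. This is a minimal sketch, assuming t is the iteration number counted from 1; the class and method names are hypothetical and are not part of the Spark API.

```java
/** Illustrative step-size schedule: initialStepSize / sqrt(t), with t counted from 1. */
final class StepSizeSchedule {
    private StepSizeSchedule() {}

    /** Returns the step size used at iteration t (t >= 1). */
    static double stepSizeAt(double initialStepSize, int t) {
        if (t < 1) {
            throw new IllegalArgumentException("iteration index must be >= 1");
        }
        return initialStepSize / Math.sqrt(t);
    }

    public static void main(String[] args) {
        double initial = 1.0; // the documented default step size
        for (int t = 1; t <= 4; t++) {
            System.out.printf("t=%d step=%.4f%n", t, stepSizeAt(initial, t));
        }
    }
}
```

Under this rule the first iteration uses the configured step size unchanged, and by the fourth iteration the step has halved.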

setMiniBatchFraction

public GradientDescent setMiniBatchFraction(double fraction)

:: Experimental :: Set the fraction of data to be used for each SGD iteration. Default 1.0 (corresponding to deterministic/classical gradient descent).

Parameters:: fraction - (undocumented)
Returns:: (undocumented)
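
To show why a fraction of 1.0 corresponds to classical gradient descent, here is a hedged plain-Java sketch of per-iteration sampling. It assumes each example is kept independently with probability equal to the fraction (sampling without replacement); the class `MiniBatchSampler` and its `seed` parameter are hypothetical and only stand in for whatever sampling the library performs on the RDD.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sampler: keeps each element independently with probability `fraction`. */
final class MiniBatchSampler {
    private MiniBatchSampler() {}

    static <T> List<T> sample(List<T> data, double fraction, long seed) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        Random rng = new Random(seed);
        List<T> batch = new ArrayList<>();
        for (T item : data) {
            // nextDouble() is in [0, 1), so fraction == 1.0 keeps every element.
            if (rng.nextDouble() < fraction) {
                batch.add(item);
            }
        }
        return batch;
    }
}
```

With a fraction below 1.0 the batch size varies from iteration to iteration, so a gradient computed on the batch should be averaged over the actual number of sampled examples.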

setNumIterations

public GradientDescent setNumIterations(int iters)

Set the number of iterations for SGD. Default 100.

Parameters:: iters - (undocumented)
Returns:: (undocumented)

setRegParam

public GradientDescent setRegParam(double regParam)

Set the regularization parameter. Default 0.0.

Parameters:: regParam - (undocumented)
Returns:: (undocumented)

setGradient

public GradientDescent setGradient(Gradient gradient)

Set the gradient function (of the loss function of one single data example) to be used for SGD.

Parameters:: gradient - (undocumented)
Returns:: (undocumented)

setUpdater

public GradientDescent setUpdater(Updater updater)

Set the updater function to actually perform a gradient step in a given direction. The updater is also responsible for performing the update from the regularization term, and therefore determines what kind of regularization is used, if any.

Parameters:: updater - (undocumented)
Returns:: (undocumented)
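
The following plain-Java sketch illustrates how the choice of updater determines the regularization. It is an assumption-laden illustration, not the library's Updater interface: the class `UpdaterSketch` and both methods are hypothetical, the step size is assumed to decay as stepSize/sqrt(iter), and the regularized variant assumes a squared-L2 penalty of (regParam/2)·||w||².

```java
/** Illustrative update rules. Not the Spark Updater API. */
final class UpdaterSketch {
    private UpdaterSketch() {}

    /** Plain gradient step without regularization: w <- w - eta * g. */
    static double[] simpleUpdate(double[] w, double[] g, double stepSize, int iter) {
        double eta = stepSize / Math.sqrt(iter);
        double[] out = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            out[i] = w[i] - eta * g[i];
        }
        return out;
    }

    /**
     * Gradient step with an assumed squared-L2 penalty (regParam / 2) * ||w||^2,
     * whose gradient is regParam * w: w <- w * (1 - eta * regParam) - eta * g.
     */
    static double[] l2Update(double[] w, double[] g, double stepSize, int iter,
                             double regParam) {
        double eta = stepSize / Math.sqrt(iter);
        double[] out = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            out[i] = w[i] * (1.0 - eta * regParam) - eta * g[i];
        }
        return out;
    }
}
```

With regParam set to 0.0 the two rules coincide, which matches the idea that the regularization behavior lives entirely in the updater.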

optimize

public Vector optimize(RDD<scala.Tuple2<Object,Vector>> data,
                       Vector initialWeights)

:: DeveloperApi :: Runs gradient descent on the given training data.

Specified by:: optimize in interface Optimizer

Parameters:: data - training data; initialWeights - initial weights
Returns:: solution vector