org.apache.spark.mllib.optimization
Class LBFGS

Object
  extended by org.apache.spark.mllib.optimization.LBFGS
All Implemented Interfaces:
java.io.Serializable, Logging, Optimizer

public class LBFGS
extends Object
implements Optimizer, Logging

:: DeveloperApi :: Class used to solve an optimization problem using Limited-memory BFGS (L-BFGS). Reference: http://en.wikipedia.org/wiki/Limited-memory_BFGS

Parameters:
gradient - Gradient function to be used.
updater - Updater to be used to update weights after every iteration.

See Also:
Serialized Form

Constructor Summary
LBFGS(Gradient gradient, Updater updater)
           
 
Method Summary
 Vector optimize(RDD<scala.Tuple2<Object,Vector>> data, Vector initialWeights)
          Solve the provided convex optimization problem.
static scala.Tuple2<Vector,double[]> runLBFGS(RDD<scala.Tuple2<Object,Vector>> data, Gradient gradient, Updater updater, int numCorrections, double convergenceTol, int maxNumIterations, double regParam, Vector initialWeights)
          Run Limited-memory BFGS (L-BFGS) in parallel.
 LBFGS setConvergenceTol(double tolerance)
          Set the convergence tolerance of iterations for L-BFGS.
 LBFGS setGradient(Gradient gradient)
          Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.
 LBFGS setMaxNumIterations(int iters)
          Deprecated. Use setNumIterations(int) instead.
 LBFGS setNumCorrections(int corrections)
          Set the number of corrections used in the L-BFGS update.
 LBFGS setNumIterations(int iters)
          Set the maximal number of iterations for L-BFGS.
 LBFGS setRegParam(double regParam)
          Set the regularization parameter.
 LBFGS setUpdater(Updater updater)
          Set the updater function to actually perform a gradient step in a given direction.
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

LBFGS

public LBFGS(Gradient gradient,
             Updater updater)
Method Detail

runLBFGS

public static scala.Tuple2<Vector,double[]> runLBFGS(RDD<scala.Tuple2<Object,Vector>> data,
                                                     Gradient gradient,
                                                     Updater updater,
                                                     int numCorrections,
                                                     double convergenceTol,
                                                     int maxNumIterations,
                                                     double regParam,
                                                     Vector initialWeights)
Run Limited-memory BFGS (L-BFGS) in parallel. The subgradients are averaged across partitions with one standard Spark map-reduce per iteration.

Parameters:
data - Input data for L-BFGS: an RDD of data examples, each of the form (label, [feature values]).
gradient - Gradient object (used to compute the gradient of the loss function for a single data example).
updater - Updater function that performs the actual gradient step in a given direction.
numCorrections - The number of corrections used in the L-BFGS update.
convergenceTol - The convergence tolerance of iterations for L-BFGS, which must be nonnegative. Lower values are less tolerant and therefore generally cause more iterations to be run.
maxNumIterations - Maximal number of iterations that L-BFGS can run.
regParam - Regularization parameter.
initialWeights - Initial weights from which the optimization starts.
Returns:
A tuple of two elements: the first is a Vector (column matrix) containing the weight for every feature, and the second is an array containing the loss computed at every iteration.
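
The per-iteration map-reduce described for runLBFGS can be pictured with plain Java collections standing in for RDD partitions. The following is a minimal sketch, not Spark's implementation: the class GradientAveragingSketch, its Partial holder, and the half-squared-error loss are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java stand-in for the per-iteration map-reduce; no Spark is involved. */
class GradientAveragingSketch {

    /** Partial result for a set of examples: summed gradient, summed loss, count. */
    static final class Partial {
        final double[] gradSum;
        final double lossSum;
        final long count;

        Partial(double[] gradSum, double lossSum, long count) {
            this.gradSum = gradSum;
            this.lossSum = lossSum;
            this.count = count;
        }
    }

    /** "Map" step: fold one partition of rows [label, x_1, ..., x_d] into a Partial. */
    static Partial mapPartition(double[][] rows, double[] weights) {
        double[] gradSum = new double[weights.length];
        double lossSum = 0.0;
        for (double[] row : rows) {
            double prediction = 0.0;
            for (int j = 0; j < weights.length; j++) {
                prediction += weights[j] * row[j + 1];
            }
            double diff = prediction - row[0];
            lossSum += 0.5 * diff * diff;          // assumed loss: half squared error
            for (int j = 0; j < weights.length; j++) {
                gradSum[j] += diff * row[j + 1];   // its gradient for this one example
            }
        }
        return new Partial(gradSum, lossSum, rows.length);
    }

    /** "Reduce" step: element-wise sum of two partial results. */
    static Partial combine(Partial a, Partial b) {
        double[] sum = new double[a.gradSum.length];
        for (int j = 0; j < sum.length; j++) {
            sum[j] = a.gradSum[j] + b.gradSum[j];
        }
        return new Partial(sum, a.lossSum + b.lossSum, a.count + b.count);
    }

    /** Applies the map step to every partition and reduces the partial results. */
    static Partial reduceAll(List<double[][]> partitions, double[] weights) {
        Partial total = new Partial(new double[weights.length], 0.0, 0L);
        for (double[][] partition : partitions) {
            total = combine(total, mapPartition(partition, weights));
        }
        return total;
    }

    /** Average gradient: the reduced gradient sum divided by the example count. */
    static double[] averageGradient(List<double[][]> partitions, double[] weights) {
        Partial total = reduceAll(partitions, weights);
        if (total.count == 0) {
            throw new IllegalArgumentException("no data examples");
        }
        double[] avg = new double[weights.length];
        for (int j = 0; j < avg.length; j++) {
            avg[j] = total.gradSum[j] / total.count;
        }
        return avg;
    }

    public static void main(String[] args) {
        List<double[][]> partitions = Arrays.asList(
            new double[][] {{1.0, 1.0}, {2.0, 2.0}},   // partition 0
            new double[][] {{3.0, 3.0}});              // partition 1
        System.out.println(Arrays.toString(averageGradient(partitions, new double[] {0.0})));
    }
}
```

Because each partition only contributes sums and a count, the result does not depend on how the examples are split across partitions.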

setNumCorrections

public LBFGS setNumCorrections(int corrections)
Set the number of corrections used in the L-BFGS update. Default 10. Values of numCorrections less than 3 are not recommended; large values of numCorrections will result in excessive computing time. 3 < numCorrections < 10 is recommended. Restriction: numCorrections > 0.

Parameters:
corrections - The number of corrections to use; must be greater than 0.
Returns:
This LBFGS instance, so that setter calls can be chained.
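
To illustrate what the number of corrections controls, here is a minimal plain-Java sketch of a bounded history of correction pairs together with the textbook two-loop recursion that consumes it. The class CorrectionHistorySketch and its methods are hypothetical and are not part of Spark; the sketch also assumes that the dot product of y and s is positive for every stored pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical bounded history of L-BFGS correction pairs (not Spark code). */
class CorrectionHistorySketch {
    private final int numCorrections;
    private final List<double[]> sList = new ArrayList<>();  // s_k = x_{k+1} - x_k
    private final List<double[]> yList = new ArrayList<>();  // y_k = g_{k+1} - g_k

    CorrectionHistorySketch(int numCorrections) {
        if (numCorrections <= 0) {
            throw new IllegalArgumentException("numCorrections must be > 0");
        }
        this.numCorrections = numCorrections;
    }

    /** Stores a new pair, discarding the oldest one once the history is full. */
    void add(double[] s, double[] y) {
        if (sList.size() == numCorrections) {
            sList.remove(0);
            yList.remove(0);
        }
        sList.add(s.clone());
        yList.add(y.clone());
    }

    int size() {
        return sList.size();
    }

    /** Two-loop recursion: approximates (inverse Hessian) * gradient from the history. */
    double[] applyInverseHessian(double[] gradient) {
        int m = sList.size();
        double[] q = gradient.clone();
        double[] alpha = new double[m];
        for (int i = m - 1; i >= 0; i--) {                    // newest to oldest
            double rho = 1.0 / dot(yList.get(i), sList.get(i));
            alpha[i] = rho * dot(sList.get(i), q);
            axpy(-alpha[i], yList.get(i), q);
        }
        if (m > 0) {                                          // initial scaling gamma * I
            double[] s = sList.get(m - 1);
            double[] y = yList.get(m - 1);
            double gamma = dot(s, y) / dot(y, y);
            for (int j = 0; j < q.length; j++) {
                q[j] *= gamma;
            }
        }
        for (int i = 0; i < m; i++) {                         // oldest to newest
            double rho = 1.0 / dot(yList.get(i), sList.get(i));
            double beta = rho * dot(yList.get(i), q);
            axpy(alpha[i] - beta, sList.get(i), q);
        }
        return q;  // the search direction is the negative of this vector
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            sum += a[j] * b[j];
        }
        return sum;
    }

    /** target += c * x */
    private static void axpy(double c, double[] x, double[] target) {
        for (int j = 0; j < x.length; j++) {
            target[j] += c * x[j];
        }
    }

    public static void main(String[] args) {
        // 1-D quadratic f(x) = 2 x^2, whose gradient is 4 x: a step s = 0.5 gives y = 2.0.
        CorrectionHistorySketch history = new CorrectionHistorySketch(2);
        history.add(new double[] {0.5}, new double[] {2.0});
        System.out.println(Arrays.toString(history.applyInverseHessian(new double[] {8.0})));
    }
}
```

Each stored pair costs two vectors of memory and two extra passes per iteration, which is why a larger history increases computing time.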

setConvergenceTol

public LBFGS setConvergenceTol(double tolerance)
Set the convergence tolerance of iterations for L-BFGS. Default 1E-4. A smaller value is less tolerant: it leads to higher accuracy at the cost of generally running more iterations. This value must be nonnegative.

Parameters:
tolerance - The convergence tolerance; must be nonnegative.
Returns:
This LBFGS instance, so that setter calls can be chained.
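
One common way to apply a convergence tolerance is to stop once the relative change in the loss between consecutive iterations falls to or below the tolerance. The sketch below assumes that criterion, which is not necessarily the exact test LBFGS applies, and uses a made-up loss sequence; the class ConvergenceToleranceSketch and its methods are hypothetical.

```java
/** Hypothetical relative-improvement stopping rule (not Spark code). */
class ConvergenceToleranceSketch {

    /**
     * Returns the first iteration k at which the relative change between
     * losses[k - 1] and losses[k] is at most convergenceTol, or the last
     * available iteration if that never happens.
     */
    static int iterationsUntilConverged(double[] losses, double convergenceTol) {
        if (convergenceTol < 0.0) {
            throw new IllegalArgumentException("convergenceTol must be nonnegative");
        }
        for (int k = 1; k < losses.length; k++) {
            double scale = Math.max(
                Math.max(Math.abs(losses[k - 1]), Math.abs(losses[k])), 1.0);
            if (Math.abs(losses[k - 1] - losses[k]) / scale <= convergenceTol) {
                return k;
            }
        }
        return Math.max(losses.length - 1, 0);
    }

    /** Made-up loss history: losses[k] = 1 + 0.5^k, approaching 1 from above. */
    static double[] halvingLosses(int n) {
        double[] losses = new double[n];
        for (int k = 0; k < n; k++) {
            losses[k] = 1.0 + Math.pow(0.5, k);
        }
        return losses;
    }

    public static void main(String[] args) {
        double[] losses = halvingLosses(31);
        System.out.println("tol 0.2:  " + iterationsUntilConverged(losses, 0.2));
        System.out.println("tol 1e-4: " + iterationsUntilConverged(losses, 1e-4));
    }
}
```

On the same loss history, the tighter tolerance stops later than the looser one.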

setMaxNumIterations

public LBFGS setMaxNumIterations(int iters)
Deprecated. Use setNumIterations(int) instead.

Set the maximal number of iterations for L-BFGS. Default 100.

Parameters:
iters - The maximal number of iterations.
Returns:
This LBFGS instance, so that setter calls can be chained.

setNumIterations

public LBFGS setNumIterations(int iters)
Set the maximal number of iterations for L-BFGS. Default 100.

Parameters:
iters - The maximal number of iterations.
Returns:
This LBFGS instance, so that setter calls can be chained.

setRegParam

public LBFGS setRegParam(double regParam)
Set the regularization parameter. Default 0.0.

Parameters:
regParam - The regularization parameter.
Returns:
This LBFGS instance, so that setter calls can be chained.

setGradient

public LBFGS setGradient(Gradient gradient)
Set the gradient function (of the loss function of one single data example) to be used for L-BFGS.

Parameters:
gradient - The gradient function to use.
Returns:
This LBFGS instance, so that setter calls can be chained.

setUpdater

public LBFGS setUpdater(Updater updater)
Set the updater function to actually perform a gradient step in a given direction. The updater is also responsible for performing the update from the regularization term, and therefore determines what kind of regularization is used, if any.

Parameters:
updater - The updater function to use.
Returns:
This LBFGS instance, so that setter calls can be chained.
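
As an illustration of how a regularization term can enter the objective, the sketch below adds a squared-L2 penalty, 0.5 * regParam * ||w||^2, to a data loss and to its gradient. The class L2RegularizationSketch is hypothetical and the choice of penalty is an assumption; in LBFGS the actual form of the regularization is determined by the configured Updater.

```java
import java.util.Arrays;

/** Hypothetical squared-L2 penalty added to a data loss and gradient (not Spark code). */
class L2RegularizationSketch {

    /** Data loss plus 0.5 * regParam * ||w||^2. */
    static double regularizedLoss(double dataLoss, double[] weights, double regParam) {
        double normSquared = 0.0;
        for (double w : weights) {
            normSquared += w * w;
        }
        return dataLoss + 0.5 * regParam * normSquared;
    }

    /** Data gradient plus regParam * w, the gradient of the penalty above. */
    static double[] regularizedGradient(double[] dataGradient, double[] weights, double regParam) {
        double[] result = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            result[j] = dataGradient[j] + regParam * weights[j];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] weights = {3.0, 4.0};
        double[] dataGradient = {0.5, -1.0};
        System.out.println(regularizedLoss(1.0, weights, 0.1));
        System.out.println(Arrays.toString(regularizedGradient(dataGradient, weights, 0.1)));
    }
}
```

With regParam set to 0.0 both methods return the data loss and data gradient unchanged.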

optimize

public Vector optimize(RDD<scala.Tuple2<Object,Vector>> data,
                       Vector initialWeights)
Description copied from interface: Optimizer
Solve the provided convex optimization problem.

Specified by:
optimize in interface Optimizer
Parameters:
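data - Input data: an RDD of data examples, each of the form (label, [feature values]).
initialWeights - Initial weights from which the optimization starts.
Returns:
The weights found by the optimizer.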