org.apache.spark.mllib.classification
Class StreamingLogisticRegressionWithSGD

Object
  extended by org.apache.spark.mllib.regression.StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>
      extended by org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
All Implemented Interfaces:
java.io.Serializable, Logging

public class StreamingLogisticRegressionWithSGD
extends StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>
implements scala.Serializable

:: Experimental :: Train or predict a logistic regression model on streaming data. Training uses stochastic gradient descent to update the model from each new batch of incoming data in a DStream (see LogisticRegressionWithSGD for the model equation).
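
As a rough illustration of what a per-batch update could look like, the plain-Scala sketch below runs a few gradient steps of logistic regression on a single batch. It does not use Spark. `Point`, `sigmoid`, `dot`, and `updateOnBatch` are hypothetical names, and the constant step size and missing intercept are simplifying assumptions, not a description of MLlib's internals.

```scala
object BatchUpdateSketch {
  // Hypothetical stand-in for a LabeledPoint: a 0/1 label and dense features.
  final case class Point(label: Double, features: Array[Double])

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "feature count must match weight count")
    var s = 0.0
    var i = 0
    while (i < a.length) { s += a(i) * b(i); i += 1 }
    s
  }

  // Run numIterations gradient steps on one batch, starting from `weights`.
  // An empty batch leaves the weights unchanged.
  def updateOnBatch(weights: Array[Double], batch: Seq[Point],
                    stepSize: Double, numIterations: Int): Array[Double] = {
    var w = weights.clone()
    for (_ <- 1 to numIterations if batch.nonEmpty) {
      val grad = new Array[Double](w.length)
      for (p <- batch) {
        // Gradient of the logistic loss for one point: (prediction - label) * x
        val err = sigmoid(dot(w, p.features)) - p.label
        for (j <- w.indices) grad(j) += err * p.features(j)
      }
      w = w.indices.map(j => w(j) - stepSize * grad(j) / batch.size).toArray
    }
    w
  }
}
```

Across batches, the weights returned by one call would be passed in as the starting weights for the next, which is the sense in which the model is updated from each new batch.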

Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.

Use the builder pattern to configure a streaming logistic regression in an application, for example:


  val model = new StreamingLogisticRegressionWithSGD()
    .setStepSize(0.5)
    .setNumIterations(10)
    .setInitialWeights(Vectors.dense(...))
  // trainOn does not return the algorithm, so call it on the configured model
  model.trainOn(DStream)
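
A fuller sketch of how the builder snippet might sit inside a Spark Streaming application is given below. The directory paths, the 5-second batch interval, the `local[2]` master, and the feature count of 3 are all placeholder assumptions. Input lines are assumed to be in the text form that `LabeledPoint.parse` reads.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingLogisticRegressionSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder settings: adjust master, batch interval, paths and feature count.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("StreamingLogisticRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(5))
    val numFeatures = 3

    // Each line is expected to look like (1.0,[0.5,0.1,2.0]).
    val trainingData = ssc.textFileStream("/tmp/train").map(LabeledPoint.parse)
    val testData = ssc.textFileStream("/tmp/test").map(LabeledPoint.parse)

    // Initial weights must have one entry per feature.
    val model = new StreamingLogisticRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

    // Register training and prediction on the streams; nothing runs until start().
    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Here `predictOnValues` keeps the true label as the key, so each printed pair shows the label next to the model's prediction for that point.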
 

See Also:
Serialized Form

Constructor Summary
StreamingLogisticRegressionWithSGD()
          Construct a StreamingLogisticRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
 
Method Summary
 StreamingLogisticRegressionWithSGD setInitialWeights(Vector initialWeights)
          Set the initial weights.
 StreamingLogisticRegressionWithSGD setMiniBatchFraction(double miniBatchFraction)
          Set the fraction of each batch to use for updates.
 StreamingLogisticRegressionWithSGD setNumIterations(int numIterations)
          Set the number of iterations of gradient descent to run per update.
 StreamingLogisticRegressionWithSGD setRegParam(double regParam)
          Set the regularization parameter.
 StreamingLogisticRegressionWithSGD setStepSize(double stepSize)
          Set the step size for gradient descent.
 
Methods inherited from class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
latestModel, predictOn, predictOn, predictOnValues, predictOnValues, trainOn, trainOn
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 

Constructor Detail

StreamingLogisticRegressionWithSGD

public StreamingLogisticRegressionWithSGD()
Construct a StreamingLogisticRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).

Method Detail

setStepSize

public StreamingLogisticRegressionWithSGD setStepSize(double stepSize)
Set the step size for gradient descent. Default: 0.1.
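
This page does not say how the step size is applied across the iterations of one update. The sketch below assumes a decaying schedule in which iteration t uses stepSize / sqrt(t); that schedule, and the names `stepAt` and `schedule`, are assumptions made for illustration only.

```scala
object StepSizeSketch {
  // Assumed schedule: the step used at iteration t (1-based) is stepSize / sqrt(t).
  def stepAt(stepSize: Double, iteration: Int): Double = {
    require(iteration >= 1, "iterations are counted from 1")
    stepSize / math.sqrt(iteration.toDouble)
  }

  // Steps for one update consisting of numIterations iterations.
  def schedule(stepSize: Double, numIterations: Int): Seq[Double] =
    (1 to numIterations).map(t => stepAt(stepSize, t))
}
```

Under this assumption a larger stepSize mainly affects the first few iterations of each update, since later steps shrink with the square root of the iteration count.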


setNumIterations

public StreamingLogisticRegressionWithSGD setNumIterations(int numIterations)
Set the number of iterations of gradient descent to run per update. Default: 50.


setMiniBatchFraction

public StreamingLogisticRegressionWithSGD setMiniBatchFraction(double miniBatchFraction)
Set the fraction of each batch to use for updates. Default: 1.0.
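
One way to picture the mini-batch fraction is as a sampling probability applied to each point of the incoming batch. The sketch below is plain Scala and assumes independent sampling without replacement with a fixed seed. `sample` is a hypothetical helper, and the exact sampling scheme used by the library is not stated on this page.

```scala
import scala.util.Random

object MiniBatchSketch {
  // Keep each element independently with probability `fraction`.
  // With fraction = 1.0 the whole batch is kept.
  def sample[A](batch: Seq[A], fraction: Double, seed: Long): Seq[A] = {
    require(fraction > 0.0 && fraction <= 1.0, "fraction must be in (0, 1]")
    val rng = new Random(seed)
    batch.filter(_ => rng.nextDouble() < fraction)
  }
}
```

With this scheme the sampled size varies from batch to batch around fraction times the batch size, so a fraction below 1.0 trades gradient accuracy for less work per iteration.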


setRegParam

public StreamingLogisticRegressionWithSGD setRegParam(double regParam)
Set the regularization parameter. Default: 0.0.
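
This page does not state which penalty the regularization parameter scales. As a sketch under the assumption of an L2 penalty (regParam / 2) * ||w||^2, a single regularized gradient step could look like the following; `step` is a hypothetical helper, not a library method.

```scala
object RegularizedStepSketch {
  // One gradient step with an assumed L2 penalty:
  // w <- w - stepSize * (grad + regParam * w)
  def step(weights: Array[Double], grad: Array[Double],
           stepSize: Double, regParam: Double): Array[Double] = {
    require(weights.length == grad.length, "weights and gradient must have equal length")
    weights.zip(grad).map { case (w, g) => w - stepSize * (g + regParam * w) }
  }
}
```

With regParam at its default of 0.0 the penalty term vanishes and the step reduces to plain gradient descent.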


setInitialWeights

public StreamingLogisticRegressionWithSGD setInitialWeights(Vector initialWeights)
Set the initial weights. This must be called before using trainOn or predictOn, and the vector must have one entry per feature.