org.apache.spark.mllib.regression.StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>

org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD

All Implemented Interfaces: Serializable, org.apache.spark.internal.Logging

public class StreamingLogisticRegressionWithSGD extends StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD> implements Serializable

Train or predict a logistic regression model on streaming data. Training uses stochastic gradient descent to update the model with each new batch of incoming data from a DStream (see LogisticRegressionWithSGD for the model equation).

Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.

Use a builder pattern to construct a streaming logistic regression analysis in an application, like:


  val model = new StreamingLogisticRegressionWithSGD()
    .setStepSize(0.5)
    .setNumIterations(10)
    .setInitialWeights(Vectors.dense(...))

  // trainOn returns Unit, so it is called separately to keep `model` usable
  model.trainOn(DStream)
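
For context, a fuller end-to-end sketch of the same builder pattern might look like the following. The directory paths, the 5-second batch interval, the feature count of 3, and the object name are assumptions made for illustration. Input lines are assumed to be in the text form that LabeledPoint.parse accepts, such as (1.0,[0.5,-1.2,3.0]).

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical application object; paths and sizes below are placeholders.
object StreamingLogisticRegressionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("StreamingLogisticRegressionSketch")
      .setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Each new text file dropped into these directories becomes part of a batch.
    val trainingData = ssc.textFileStream("/tmp/train").map(LabeledPoint.parse)
    val testData = ssc.textFileStream("/tmp/test").map(LabeledPoint.parse)

    // The initial weight vector fixes the number of features (assumed to be 3 here).
    val numFeatures = 3
    val model = new StreamingLogisticRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

    // Register training and prediction on the streams; trainOn returns Unit.
    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Nothing is trained until ssc.start() is called; the trainOn and predictOnValues calls only register operations on the streams.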

Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| StreamingLogisticRegressionWithSGD() | Construct a StreamingLogisticRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| StreamingLogisticRegressionWithSGD | setInitialWeights(Vector initialWeights) | Set the initial weights. |
| StreamingLogisticRegressionWithSGD | setMiniBatchFraction(double miniBatchFraction) | Set the fraction of each batch to use for updates. |
| StreamingLogisticRegressionWithSGD | setNumIterations(int numIterations) | Set the number of iterations of gradient descent to run per update. |
| StreamingLogisticRegressionWithSGD | setRegParam(double regParam) | Set the regularization parameter. |
| StreamingLogisticRegressionWithSGD | setStepSize(double stepSize) | Set the step size for gradient descent. |

Methods inherited from class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
latestModel, predictOn, predictOn, predictOnValues, predictOnValues, trainOn, trainOn

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext

Constructor Details
- StreamingLogisticRegressionWithSGD
  
  public StreamingLogisticRegressionWithSGD()
  
  Construct a StreamingLogisticRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).

Method Details
- setInitialWeights
  
  public StreamingLogisticRegressionWithSGD setInitialWeights(Vector initialWeights)
  
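  Set the initial weights. The vector length must match the number of features in the data. The weights must be set before using trainOn or predictOn.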
- setMiniBatchFraction
  
  public StreamingLogisticRegressionWithSGD setMiniBatchFraction(double miniBatchFraction)
  
  Set the fraction of each batch to use for updates. Default: 1.0.
- setNumIterations
  
  public StreamingLogisticRegressionWithSGD setNumIterations(int numIterations)
  
  Set the number of iterations of gradient descent to run per update. Default: 50.
- setRegParam
  
  public StreamingLogisticRegressionWithSGD setRegParam(double regParam)
  
  Set the regularization parameter. Default: 0.0.
- setStepSize
  
  public StreamingLogisticRegressionWithSGD setStepSize(double stepSize)
  
  Set the step size for gradient descent. Default: 0.1.
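
To make the roles of the four tuning parameters concrete, here is a minimal pure-Scala sketch of a per-batch update. It is an illustration under simplifying assumptions, not Spark's implementation: the names StreamingSgdSketch, Point, and updateOnBatch are hypothetical, the step size is held constant across iterations, and regParam is applied as a plain L2 penalty.

```scala
import scala.util.Random

// Hypothetical helper, not part of Spark: a simplified per-batch SGD update.
object StreamingSgdSketch {
  // One labeled example: label is 0.0 or 1.0, features has a fixed length.
  final case class Point(label: Double, features: Array[Double])

  private def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  /** Run numIterations gradient steps on one batch and return the new weights. */
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Point],
      stepSize: Double = 0.1,
      numIterations: Int = 50,
      miniBatchFraction: Double = 1.0,
      regParam: Double = 0.0,
      rng: Random = new Random(42)): Array[Double] = {
    require(batch.forall(_.features.length == weights.length),
      "every point must have as many features as there are weights")
    var w = weights.clone()
    for (_ <- 1 to numIterations) {
      // Sample roughly miniBatchFraction of the batch for this iteration.
      val sample =
        if (miniBatchFraction >= 1.0) batch
        else batch.filter(_ => rng.nextDouble() < miniBatchFraction)
      if (sample.nonEmpty) {
        // Accumulate the logistic-loss gradient over the sampled points.
        val grad = Array.fill(w.length)(0.0)
        for (p <- sample) {
          val err = sigmoid(dot(w, p.features)) - p.label
          for (j <- w.indices) grad(j) += err * p.features(j)
        }
        // Average the gradient, add the L2 penalty term, and take one step.
        w = w.indices.map { j =>
          w(j) - stepSize * (grad(j) / sample.size + regParam * w(j))
        }.toArray
      }
    }
    w
  }
}
```

Calling updateOnBatch once per batch and feeding the returned weights into the next call mirrors how the streaming model carries its state from one batch to the next. An empty batch leaves the weights unchanged.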
