Class StreamingLogisticRegressionWithSGD

Object
org.apache.spark.mllib.regression.StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>
org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging

public class StreamingLogisticRegressionWithSGD extends StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD> implements Serializable
Train or predict a logistic regression model on streaming data. Training uses stochastic gradient descent (SGD) to update the model with each new batch of incoming data from a DStream (see LogisticRegressionWithSGD for the model equation).

Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
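
To make the per-batch update concrete, the following is a minimal plain-Scala sketch with no Spark dependency. It is not Spark's implementation: `Point`, `updateOnBatch`, and the sample data are hypothetical names and values chosen for illustration, and Spark's optimizer may differ in details such as step-size scheduling and mini-batch sampling. The sketch only shows the idea described above: start from an initial weight vector, then refine it with gradient steps on each incoming batch, where batch sizes may vary but the feature count stays fixed.

```scala
// Illustrative only: plain Scala, not Spark's implementation.
object StreamingSgdSketch {
  // Hypothetical stand-in for a LabeledPoint: a 0/1 label plus a feature array.
  final case class Point(label: Double, features: Array[Double])

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "the number of features must be constant")
    var sum = 0.0
    var i = 0
    while (i < a.length) { sum += a(i) * b(i); i += 1 }
    sum
  }

  // Run numIterations gradient steps on one batch, starting from the given weights.
  // Every point in the batch is used at each step (a mini-batch fraction of 1.0).
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Point],
      stepSize: Double,
      numIterations: Int): Array[Double] = {
    if (batch.isEmpty) {
      weights // nothing to learn from an empty batch
    } else {
      var w = weights.clone()
      for (_ <- 1 to numIterations) {
        val grad = new Array[Double](w.length)
        for (p <- batch) {
          // Gradient of the logistic loss for one point: (prediction - label) * features
          val err = sigmoid(dot(w, p.features)) - p.label
          for (j <- w.indices) grad(j) += err * p.features(j)
        }
        w = w.indices.map(j => w(j) - stepSize * grad(j) / batch.size).toArray
      }
      w
    }
  }

  def main(args: Array[String]): Unit = {
    val initialWeights = Array(0.0, 0.0)
    // Two batches of different sizes, each with two features per point.
    val batches = Seq(
      Seq(Point(1.0, Array(1.0, 2.0)), Point(0.0, Array(-1.0, -1.5))),
      Seq(Point(1.0, Array(0.5, 1.0)), Point(0.0, Array(-2.0, -0.5)), Point(1.0, Array(1.5, 0.2)))
    )
    // The weights produced by one batch are the starting point for the next.
    val finalWeights = batches.foldLeft(initialWeights) { (w, b) =>
      updateOnBatch(w, b, stepSize = 0.5, numIterations = 10)
    }
    println(finalWeights.mkString(", "))
  }
}
```

Folding over the batches mirrors the streaming behaviour: the model is never retrained from scratch, and each batch only nudges the weights it inherits.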

Use the builder pattern to configure a streaming logistic regression model in an application, for example:


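  val model = new StreamingLogisticRegressionWithSGD()
    .setStepSize(0.5)
    .setNumIterations(10)
    .setInitialWeights(Vectors.dense(...))
  // trainOn returns Unit, so call it on the configured model instead of chaining it into the assignment.
  // DStream stands for a DStream of LabeledPoints.
  model.trainOn(DStream)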
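
A fuller sketch of the same builder pattern inside a streaming application might look like the following. Several things here are assumptions for illustration only: the object name, the `local[2]` master, the one-second batch interval, the feature count of 3, the use of directories passed as command-line arguments, and the text format expected by `LabeledPoint.parse`. The setter values simply restate the defaults documented for the no-argument constructor.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingLogisticRegressionSketch {
  // Hypothetical feature count; it must match the data arriving on the stream.
  val numFeatures = 3

  // Configure the algorithm. The values below restate the documented defaults;
  // the initial weights have no default and must be set explicitly.
  def buildModel(): StreamingLogisticRegressionWithSGD =
    new StreamingLogisticRegressionWithSGD()
      .setStepSize(0.1)
      .setNumIterations(50)
      .setMiniBatchFraction(1.0)
      .setRegParam(0.0)
      .setInitialWeights(Vectors.zeros(numFeatures))

  def main(args: Array[String]): Unit = {
    // Assumed setup: local master and a one-second batch interval.
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLogisticRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // args(0) and args(1) are hypothetical directories watched for new text files,
    // with one labeled point per line in the format LabeledPoint.parse accepts.
    val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)

    val model = buildModel()
    model.trainOn(trainingData) // registers the per-batch update; returns Unit
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Keeping the configuration in `buildModel` separates it from the streaming setup, so the configured algorithm can be inspected (for example through `latestModel()`) without starting a streaming context.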
 
  • Constructor Details

    • StreamingLogisticRegressionWithSGD

      public StreamingLogisticRegressionWithSGD()
      Construct a StreamingLogisticRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. Initial weights must be set before calling trainOn or predictOn (see StreamingLinearAlgorithm).
  • Method Details