class StreamingLogisticRegressionWithSGD extends StreamingLinearAlgorithm[LogisticRegressionModel, LogisticRegressionWithSGD] with Serializable
Train or predict a logistic regression model on streaming data. Training uses stochastic gradient descent to update the model based on each new batch of incoming data from a DStream (see LogisticRegressionWithSGD for the model equation).
Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
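The per-batch update described above can be pictured with a small plain-Scala sketch. This is not MLlib's implementation: `Example`, `SgdSketch`, and `updateOnBatch` are hypothetical names, labels are assumed to be 0.0 or 1.0, the step size is held constant, and regularization and mini-batch sampling are left out.

```scala
// Hypothetical stand-in for a labeled point; not Spark's LabeledPoint.
final case class Example(label: Double, features: Array[Double])

object SgdSketch {
  private def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  private def dot(a: Array[Double], b: Array[Double]): Double = {
    // The number of features must match the number of weights in every batch.
    require(a.length == b.length, "number of features must stay constant")
    a.indices.map(i => a(i) * b(i)).sum
  }

  /** Run numIterations gradient steps on one batch, starting from the given weights. */
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Example],
      stepSize: Double,
      numIterations: Int): Array[Double] =
    if (batch.isEmpty) weights.clone() // an empty batch leaves the weights unchanged
    else (1 to numIterations).foldLeft(weights.clone()) { (w, _) =>
      // Accumulate the logistic-loss gradient over the whole batch.
      val gradient = new Array[Double](w.length)
      for (ex <- batch) {
        val error = sigmoid(dot(w, ex.features)) - ex.label
        for (j <- gradient.indices) gradient(j) += error * ex.features(j)
      }
      // Step against the average gradient (constant step size assumed here).
      w.indices.map(j => w(j) - stepSize * gradient(j) / batch.size).toArray
    }
}
```

Feeding the weights returned for one batch into the call for the next batch mimics how a streaming model carries its state forward, which is also why a starting weight vector of the right length is needed.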
Use a builder pattern to construct a streaming logistic regression analysis in an application, for example:
val model = new StreamingLogisticRegressionWithSGD()
  .setStepSize(0.5)
  .setNumIterations(10)
  .setInitialWeights(Vectors.dense(...))
model.trainOn(DStream)
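A fuller sketch of how the builder might sit inside an application follows. The object name, the four command-line arguments, the use of text-file streams, and the zero initial weights are all assumptions made for illustration; input lines are assumed to be in the text form accepted by `LabeledPoint.parse`, such as `(1.0,[0.5,1.2,0.0])`.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingLogisticRegressionSketch {
  /** Configure the algorithm; zero initial weights are an assumption of this sketch. */
  def buildModel(numFeatures: Int): StreamingLogisticRegressionWithSGD =
    new StreamingLogisticRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

  def main(args: Array[String]): Unit = {
    // Hypothetical arguments: training directory, test directory, batch seconds, feature count.
    require(args.length == 4, "usage: <trainingDir> <testDir> <batchSeconds> <numFeatures>")
    val Array(trainingDir, testDir, batchSeconds, numFeatures) = args

    // The master URL is expected to come from spark-submit.
    val conf = new SparkConf().setAppName("StreamingLogisticRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(batchSeconds.toLong))

    // Each new file dropped into a directory becomes part of the next batch.
    val trainingData = ssc.textFileStream(trainingDir).map(LabeledPoint.parse)
    val testData = ssc.textFileStream(testDir).map(LabeledPoint.parse)

    val model = buildModel(numFeatures.toInt)

    // trainOn returns Unit: it registers the stream and updates the model on every batch.
    model.trainOn(trainingData)
    // Carry the true label along as the key so it prints next to each prediction.
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Nothing is trained or predicted until `ssc.start()` is called; the calls before it only describe the computation.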
- Annotations
- @Since( "1.3.0" )
- Source
- StreamingLogisticRegressionWithSGD.scala
Instance Constructors
-
new
StreamingLogisticRegressionWithSGD()
Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).
- Annotations
- @Since( "1.3.0" )
Value Members
-
def
latestModel(): LogisticRegressionModel
Return the latest model.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
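As a rough illustration of inspecting the model while a stream is running, the sketch below formats the weights and intercept returned by `latestModel()`. `LatestModelSketch` and `describe` are hypothetical names, and the sketch assumes initial weights have already been set on the algorithm.

```scala
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors

object LatestModelSketch {
  /** Format the current weights and intercept, for example for logging. */
  def describe(algorithm: StreamingLogisticRegressionWithSGD): String = {
    val current = algorithm.latestModel()
    val weights = current.weights.toArray.mkString("[", ", ", "]")
    s"weights=$weights intercept=${current.intercept}"
  }
}
```

One possible use is to call `describe` from a `foreachRDD` block registered after `trainOn`, so that a line is logged for each batch; whether the logged values already reflect that batch depends on how the output operations are ordered, so treat the output as indicative.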
-
def
predictOn(data: JavaDStream[Vector]): JavaDStream[Double]
Java-friendly version of predictOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
-
def
predictOn(data: DStream[Vector]): DStream[Double]
Use the model to make predictions on batches of data from a DStream.
- data
DStream containing feature vectors
- returns
DStream containing predictions
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
-
def
predictOnValues[K](data: JavaPairDStream[K, Vector]): JavaPairDStream[K, Double]
Java-friendly version of predictOnValues.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
-
def
predictOnValues[K](data: DStream[(K, Vector)])(implicit arg0: ClassTag[K]): DStream[(K, Double)]
Use the model to make predictions on the values of a DStream and carry over its keys.
- K
key type
- data
DStream containing feature vectors
- returns
DStream containing the input keys and the predictions as values
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
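Because keys are carried through unchanged, one common pattern is to use the true label as the key and compare it with the prediction. The sketch below computes a per-batch accuracy this way; `KeyedPredictionSketch`, `score`, `merge`, and `batchAccuracy` are hypothetical names, and the algorithm is assumed to have had its initial weights set.

```scala
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.dstream.DStream

object KeyedPredictionSketch {
  /** Map a (label, prediction) pair to (1 if correct else 0, 1). */
  def score(pair: (Double, Double)): (Long, Long) =
    (if (pair._1 == pair._2) 1L else 0L, 1L)

  /** Add up two (correct, total) counters. */
  def merge(a: (Long, Long), b: (Long, Long)): (Long, Long) =
    (a._1 + b._1, a._2 + b._2)

  /** Fraction of correct predictions per batch; empty batches produce no element. */
  def batchAccuracy(
      algorithm: StreamingLogisticRegressionWithSGD,
      labeled: DStream[LabeledPoint]): DStream[Double] =
    algorithm
      .predictOnValues(labeled.map(lp => (lp.label, lp.features))) // key = true label
      .map(score)
      .reduce(merge)
      .map { case (correct, total) => correct.toDouble / total }
}
```

Any other key type works the same way, for instance a record identifier that lets predictions be joined back to their source rows.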
-
def
setInitialWeights(initialWeights: Vector): StreamingLogisticRegressionWithSGD.this.type
Set the initial weights. Default: [0.0, 0.0].
- Annotations
- @Since( "1.3.0" )
-
def
setMiniBatchFraction(miniBatchFraction: Double): StreamingLogisticRegressionWithSGD.this.type
Set the fraction of each batch to use for updates. Default: 1.0.
- Annotations
- @Since( "1.3.0" )
-
def
setNumIterations(numIterations: Int): StreamingLogisticRegressionWithSGD.this.type
Set the number of iterations of gradient descent to run per update. Default: 50.
- Annotations
- @Since( "1.3.0" )
-
def
setRegParam(regParam: Double): StreamingLogisticRegressionWithSGD.this.type
Set the regularization parameter. Default: 0.0.
- Annotations
- @Since( "1.3.0" )
-
def
setStepSize(stepSize: Double): StreamingLogisticRegressionWithSGD.this.type
Set the step size for gradient descent. Default: 0.1.
- Annotations
- @Since( "1.3.0" )
-
def
trainOn(data: JavaDStream[LabeledPoint]): Unit
Java-friendly version of trainOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
-
def
trainOn(data: DStream[LabeledPoint]): Unit
Update the model by training on batches of data from a DStream. This operation registers a DStream for training the model, and updates the model based on every subsequent batch of data from the stream.
- data
DStream containing labeled data
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )