class StreamingLinearRegressionWithSGD extends StreamingLinearAlgorithm[LinearRegressionModel, LinearRegressionWithSGD] with Serializable
Train or predict a linear regression model on streaming data. Training uses
Stochastic Gradient Descent to update the model based on each new batch of
incoming data from a DStream (see LinearRegressionWithSGD for the model equation).
Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
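As a rough illustration of the per-batch update described above, the following plain-Scala sketch runs a few gradient steps on one batch and then carries the resulting weights over to the next batch. It is not Spark's implementation: the Point type, the updateOnBatch helper, the squared-error gradient, and the stepSize / sqrt(iteration) decay are assumptions made for the example.

```scala
object SgdBatchSketch {
  // Hypothetical stand-in for a LabeledPoint: a label plus a fixed-length feature array.
  final case class Point(label: Double, features: Array[Double])

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Run `numIterations` gradient steps over one batch, starting from `weights`.
  // Assumes squared-error loss and a step that decays as stepSize / sqrt(iteration).
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Point],
      stepSize: Double,
      numIterations: Int): Array[Double] = {
    // The batch size may vary, but every point must have as many features as there are weights.
    require(batch.forall(_.features.length == weights.length), "feature count must match weights")
    if (batch.isEmpty) {
      weights // assumption for this sketch: an empty batch leaves the model unchanged
    } else {
      var w = weights.clone()
      for (iter <- 1 to numIterations) {
        val grad = new Array[Double](w.length)
        for (p <- batch) {
          val diff = dot(w, p.features) - p.label
          for (j <- w.indices) grad(j) += diff * p.features(j)
        }
        val step = stepSize / math.sqrt(iter.toDouble)
        w = w.indices.map(j => w(j) - step * grad(j) / batch.size).toArray
      }
      w
    }
  }

  def main(args: Array[String]): Unit = {
    // Two batches of different sizes whose labels follow y = 2 * x.
    val first = Seq(Point(2.0, Array(1.0)), Point(-2.0, Array(-1.0)))
    val second = Seq(Point(2.0, Array(1.0)))
    val afterFirst = updateOnBatch(Array(0.0), first, stepSize = 0.5, numIterations = 10)
    val afterSecond = updateOnBatch(afterFirst, second, stepSize = 0.5, numIterations = 10)
    println(afterSecond.mkString(", "))
  }
}
```

The weights returned for one batch are the starting point for the next, which is why an initial weight vector has to be supplied before the first batch arrives.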
Use a builder pattern to construct a streaming linear regression analysis in an application, like:
val model = new StreamingLinearRegressionWithSGD()
  .setStepSize(0.5)
  .setNumIterations(10)
  .setInitialWeights(Vectors.dense(...))
model.trainOn(DStream)
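A fuller, self-contained sketch of the same pattern is shown here. It assumes a local master, a one-second batch interval, a five-second timeout, and a small hand-made queue of batches fed through queueStream. The object name, the data, and those settings are illustrative choices, not part of this class.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingRegressionSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two threads and one-second batches.
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Two batches of different sizes; every point has exactly two features.
    // Labels follow y = 1 * x1 + 2 * x2.
    val batches = mutable.Queue[RDD[LabeledPoint]](
      ssc.sparkContext.parallelize(Seq(
        LabeledPoint(3.0, Vectors.dense(1.0, 1.0)),
        LabeledPoint(5.0, Vectors.dense(1.0, 2.0)))),
      ssc.sparkContext.parallelize(Seq(
        LabeledPoint(4.0, Vectors.dense(2.0, 1.0)))))
    val training = ssc.queueStream(batches)

    // Initial weights must match the number of features (two here).
    val model = new StreamingLinearRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(2))

    // Register the stream for training before the context is started.
    model.trainOn(training)

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000L)
    ssc.stop(stopSparkContext = true, stopGracefully = false)

    // Inspect whatever the model has learned from the batches processed so far.
    println(model.latestModel().weights)
  }
}
```

Because trainOn returns Unit, the configured object is kept in model and trainOn is called on it separately, so latestModel() remains reachable afterwards.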
- Annotations
- @Since( "1.1.0" )
- Source
- StreamingLinearRegressionWithSGD.scala
Instance Constructors
- new StreamingLinearRegressionWithSGD()
Construct a StreamingLinearRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).
- Annotations
- @Since( "1.1.0" )
Value Members
- val algorithm: LinearRegressionWithSGD
The algorithm to use for updating.
- Definition Classes
- StreamingLinearRegressionWithSGD → StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
- def latestModel(): LinearRegressionModel
Return the latest model.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
- def predictOn(data: JavaDStream[Vector]): JavaDStream[Double]
Java-friendly version of predictOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
- def predictOn(data: DStream[Vector]): DStream[Double]
Use the model to make predictions on batches of data from a DStream.
- data
DStream containing feature vectors
- returns
DStream containing predictions
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
- def predictOnValues[K](data: JavaPairDStream[K, Vector]): JavaPairDStream[K, Double]
Java-friendly version of predictOnValues.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
- def predictOnValues[K](data: DStream[(K, Vector)])(implicit arg0: ClassTag[K]): DStream[(K, Double)]
Use the model to make predictions on the values of a DStream and carry over its keys.
- K
key type
- data
DStream containing feature vectors
- returns
DStream containing the input keys and the predictions as values
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )
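To illustrate how keys are carried over, this small sketch wraps predictOnValues in a helper that scores a keyed stream. The helper name scoreById and the choice of String keys are hypothetical. The sketch assumes the model already has its initial weights set.

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
import org.apache.spark.streaming.dstream.DStream

object KeyedPredictionSketch {
  // Hypothetical helper: each input record is (id, features) and each
  // output record is (id, prediction), so results can be matched to inputs.
  def scoreById(
      model: StreamingLinearRegressionWithSGD,
      events: DStream[(String, Vector)]): DStream[(String, Double)] =
    model.predictOnValues(events)
}
```

The ClassTag for the key type is resolved implicitly by the compiler, so callers normally do not pass it themselves.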
- def setConvergenceTol(tolerance: Double): StreamingLinearRegressionWithSGD.this.type
Set the convergence tolerance. Default: 0.001.
- Annotations
- @Since( "1.5.0" )
- def setInitialWeights(initialWeights: Vector): StreamingLinearRegressionWithSGD.this.type
Set the initial weights.
- Annotations
- @Since( "1.1.0" )
- def setMiniBatchFraction(miniBatchFraction: Double): StreamingLinearRegressionWithSGD.this.type
Set the fraction of each batch to use for updates. Default: 1.0.
- Annotations
- @Since( "1.1.0" )
- def setNumIterations(numIterations: Int): StreamingLinearRegressionWithSGD.this.type
Set the number of iterations of gradient descent to run per update. Default: 50.
- Annotations
- @Since( "1.1.0" )
- def setRegParam(regParam: Double): StreamingLinearRegressionWithSGD.this.type
Set the regularization parameter. Default: 0.0.
- Annotations
- @Since( "2.0.0" )
- def setStepSize(stepSize: Double): StreamingLinearRegressionWithSGD.this.type
Set the step size for gradient descent. Default: 0.1.
- Annotations
- @Since( "1.1.0" )
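Since every setter returns this.type, the setters can be chained in any order. The sketch below spells out the documented defaults explicitly. The object and method names are hypothetical, and zero-valued initial weights are an arbitrary starting point chosen for the example.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

object ModelConfigSketch {
  // Hypothetical factory: builds a model with each documented default written out.
  def configured(numFeatures: Int): StreamingLinearRegressionWithSGD =
    new StreamingLinearRegressionWithSGD()
      .setStepSize(0.1)          // documented default
      .setNumIterations(50)      // documented default
      .setMiniBatchFraction(1.0) // use every point of each batch
      .setRegParam(0.0)          // setter available since 2.0.0
      .setConvergenceTol(0.001)  // setter available since 1.5.0
      .setInitialWeights(Vectors.zeros(numFeatures)) // required before trainOn or predictOn
}
```

Only setInitialWeights is required. The other calls matter only when a value should differ from its default.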
- def trainOn(data: JavaDStream[LabeledPoint]): Unit
Java-friendly version of trainOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.3.0" )
- def trainOn(data: DStream[LabeledPoint]): Unit
Update the model by training on batches of data from a DStream. This operation registers a DStream for training the model, and updates the model based on every subsequent batch of data from the stream.
- data
DStream containing labeled data
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since( "1.1.0" )