class StreamingLinearRegressionWithSGD extends StreamingLinearAlgorithm[LinearRegressionModel, LinearRegressionWithSGD] with Serializable
Train or predict a linear regression model on streaming data. Training uses stochastic gradient descent to update the model with each new batch of incoming data from a DStream (see LinearRegressionWithSGD for the model equation).
Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
Use a builder pattern to construct a streaming linear regression analysis in an application, like:
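val model = new StreamingLinearRegressionWithSGD()
  .setStepSize(0.5)
  .setNumIterations(10)
  .setInitialWeights(Vectors.dense(...))
// trainOn returns Unit, so call it on the configured model rather than at the end of the chain
model.trainOn(DStream)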
- Annotations
- @Since("1.1.0")
- Source
- StreamingLinearRegressionWithSGD.scala
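The builder-style setup described above can be sketched end to end as follows. This is a minimal illustration, not a reference implementation: the object name, the local[2] master, the in-memory queueStream source, and the synthetic data (labels generated as 2 * x1 - x2) are all assumptions chosen to keep the example self-contained. A real application would more likely read from a file, socket, or message-queue stream.

```scala
import scala.collection.mutable
import scala.util.Random

import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical driver program; names and data are illustrative only.
object StreamingRegressionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingRegressionSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // In-memory source: each queued RDD is consumed as one batch.
    val batches = mutable.Queue[RDD[LabeledPoint]]()
    val stream = ssc.queueStream(batches)

    // The number of features must stay constant across batches,
    // and initial weights must be set before trainOn or predictOn.
    val numFeatures = 2
    val model = new StreamingLinearRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

    // Register the stream for training, and print (label, prediction) pairs per batch.
    model.trainOn(stream)
    model.predictOnValues(stream.map(lp => (lp.label, lp.features))).print()

    // Enqueue five synthetic batches before starting the context.
    for (_ <- 1 to 5) {
      val points = (1 to 100).map { _ =>
        val x1 = Random.nextDouble()
        val x2 = Random.nextDouble()
        LabeledPoint(2 * x1 - x2, Vectors.dense(x1, x2))
      }
      batches += ssc.sparkContext.makeRDD(points)
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(10000)
    ssc.stop()

    // Inspect the weights learned so far.
    println(model.latestModel().weights)
  }
}
```

Here the same stream is used for both training and prediction only for brevity; separate training and test streams work the same way.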
Instance Constructors
- new StreamingLinearRegressionWithSGD()
Construct a StreamingLinearRegressionWithSGD object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).
- Annotations
- @Since("1.1.0")
Type Members
- implicit class LogStringContext extends AnyRef
- Definition Classes
- Logging
Value Members
- val algorithm: LinearRegressionWithSGD
The algorithm to use for updating.
- Definition Classes
- StreamingLinearRegressionWithSGD → StreamingLinearAlgorithm
- Annotations
- @Since("1.1.0")
- def latestModel(): LinearRegressionModel
Return the latest model.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.1.0")
- def predictOn(data: JavaDStream[Vector]): JavaDStream[Double]
Java-friendly version of predictOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.3.0")
- def predictOn(data: DStream[Vector]): DStream[Double]
Use the model to make predictions on batches of data from a DStream.
- data
DStream containing feature vectors
- returns
DStream containing predictions
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.1.0")
- def predictOnValues[K](data: JavaPairDStream[K, Vector]): JavaPairDStream[K, Double]
Java-friendly version of predictOnValues.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.3.0")
- def predictOnValues[K](data: DStream[(K, Vector)])(implicit arg0: ClassTag[K]): DStream[(K, Double)]
Use the model to make predictions on the values of a DStream and carry over its keys.
- K
key type
- data
DStream containing (key, feature vector) pairs
- returns
DStream containing the input keys and the predictions as values
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.1.0")
- def setConvergenceTol(tolerance: Double): StreamingLinearRegressionWithSGD.this.type
Set the convergence tolerance. Default: 0.001.
- Annotations
- @Since("1.5.0")
- def setInitialWeights(initialWeights: Vector): StreamingLinearRegressionWithSGD.this.type
Set the initial weights.
- Annotations
- @Since("1.1.0")
- def setMiniBatchFraction(miniBatchFraction: Double): StreamingLinearRegressionWithSGD.this.type
Set the fraction of each batch to use for updates. Default: 1.0.
- Annotations
- @Since("1.1.0")
- def setNumIterations(numIterations: Int): StreamingLinearRegressionWithSGD.this.type
Set the number of iterations of gradient descent to run per update. Default: 50.
- Annotations
- @Since("1.1.0")
- def setRegParam(regParam: Double): StreamingLinearRegressionWithSGD.this.type
Set the regularization parameter. Default: 0.0.
- Annotations
- @Since("2.0.0")
- def setStepSize(stepSize: Double): StreamingLinearRegressionWithSGD.this.type
Set the step size for gradient descent. Default: 0.1.
- Annotations
- @Since("1.1.0")
- def trainOn(data: JavaDStream[LabeledPoint]): Unit
Java-friendly version of trainOn.
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.3.0")
- def trainOn(data: DStream[LabeledPoint]): Unit
Update the model by training on batches of data from a DStream. This operation registers a DStream for training the model, and updates the model based on every subsequent batch of data from the stream.
- data
DStream containing labeled data
- Definition Classes
- StreamingLinearAlgorithm
- Annotations
- @Since("1.1.0")
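To make the per-batch update performed by trainOn more concrete, the following plain-Scala sketch mimics the idea without using Spark: each incoming batch triggers a fixed number of gradient descent iterations on the squared error, starting from the weights left by the previous batch. It is an illustration only and not the library's implementation. The names StreamingSgdSketch, Point, and updateOnBatch are hypothetical, the decaying step size stepSize / sqrt(iteration) is an assumption of this sketch, and mini-batch sampling, regularization, and the convergence tolerance are omitted.

```scala
object StreamingSgdSketch {
  // Hypothetical stand-in for LabeledPoint: (label, features).
  type Point = (Double, Array[Double])

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.indices.map(i => a(i) * b(i)).sum

  // Run numIterations gradient descent steps on one batch, warm-started from `weights`.
  // An empty batch leaves the weights unchanged.
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Point],
      stepSize: Double,
      numIterations: Int): Array[Double] = {
    var w = weights.clone()
    if (batch.nonEmpty) {
      for (iter <- 1 to numIterations) {
        // Average gradient of 0.5 * (w . x - y)^2 over the batch.
        val grad = new Array[Double](w.length)
        for ((y, x) <- batch; j <- w.indices) {
          grad(j) += (dot(w, x) - y) * x(j) / batch.size
        }
        // Assumed schedule for this sketch: step size shrinks with the iteration count.
        val eta = stepSize / math.sqrt(iter.toDouble)
        w = Array.tabulate(w.length)(j => w(j) - eta * grad(j))
      }
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // Two batches of different sizes but the same number of features (labels are 2 * x).
    val batches: Seq[Seq[Point]] = Seq(
      Seq((1.0, Array(0.5)), (2.0, Array(1.0))),
      Seq((2.0, Array(1.0)), (0.4, Array(0.2)), (1.6, Array(0.8)))
    )
    // Fold over the "stream": each batch refines the weights produced by the previous one.
    val finalWeights = batches.foldLeft(Array(0.0)) { (w, batch) =>
      updateOnBatch(w, batch, stepSize = 0.5, numIterations = 50)
    }
    println(finalWeights.mkString(", "))
  }
}
```

The fold makes the warm-start behaviour explicit: batch sizes may differ, but the weight vector length, and therefore the number of features, must stay the same throughout.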