Class StreamingLogisticRegressionWithSGD
Object
org.apache.spark.mllib.regression.StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>
org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
public class StreamingLogisticRegressionWithSGD
extends StreamingLinearAlgorithm<LogisticRegressionModel,LogisticRegressionWithSGD>
implements Serializable
Train or predict a logistic regression model on streaming data. Training uses Stochastic Gradient Descent to update the model based on each new batch of incoming data from a DStream (see LogisticRegressionWithSGD for the model equation).
Each batch of data is assumed to be an RDD of LabeledPoints. The number of data points per batch can vary, but the number of features must be constant. An initial weight vector must be provided.
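The per-batch update described above can be sketched in plain Scala without Spark. This is only an illustration under stated assumptions, not Spark's implementation: the names `StreamingSgdSketch`, `Example`, `updateOnBatch`, and `trainOnBatches` are hypothetical, every example in a batch contributes to the gradient (as with a mini-batch fraction of 1.0), regularization is omitted, and the step size is assumed to decay as stepSize / sqrt(iteration).

```scala
// Illustrative sketch only: plain Scala, no Spark. All names are hypothetical.
object StreamingSgdSketch {
  // One labeled example: label in {0.0, 1.0} plus a fixed-length feature array.
  final case class Example(label: Double, features: Array[Double])

  private def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "feature count must match weight count")
    var sum = 0.0
    var i = 0
    while (i < a.length) { sum += a(i) * b(i); i += 1 }
    sum
  }

  // Logistic model: P(label = 1 | x) = 1 / (1 + exp(-w . x)).
  def predictProbability(weights: Array[Double], features: Array[Double]): Double =
    1.0 / (1.0 + math.exp(-dot(weights, features)))

  // Run numIterations gradient steps on one batch, starting from the current weights.
  def updateOnBatch(
      weights: Array[Double],
      batch: Seq[Example],
      stepSize: Double,
      numIterations: Int): Array[Double] = {
    if (batch.isEmpty) {
      weights.clone() // an empty batch leaves the model unchanged
    } else {
      var w = weights.clone()
      for (iter <- 1 to numIterations) {
        // Sum the logistic-loss gradient over every example in the batch.
        val grad = new Array[Double](w.length)
        for (ex <- batch) {
          val err = predictProbability(w, ex.features) - ex.label
          for (j <- w.indices) grad(j) += err * ex.features(j)
        }
        // Decaying step size: an assumption of this sketch.
        val step = stepSize / math.sqrt(iter.toDouble)
        w = w.indices.map(j => w(j) - step * grad(j) / batch.size).toArray
      }
      w
    }
  }

  // Streaming training: carry the weights from one batch to the next.
  def trainOnBatches(
      initialWeights: Array[Double],
      batches: Seq[Seq[Example]],
      stepSize: Double,
      numIterations: Int): Array[Double] =
    batches.foldLeft(initialWeights)((w, b) => updateOnBatch(w, b, stepSize, numIterations))
}
```

Batches may differ in size, but the `require` in `dot` fails as soon as an example's feature count differs from the length of the initial weight vector. That mirrors the constraint stated above.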
Use a builder pattern to construct a streaming logistic regression analysis in an application, like:
val model = new StreamingLogisticRegressionWithSGD()
  .setStepSize(0.5)
  .setNumIterations(10)
  .setInitialWeights(Vectors.dense(...))
// trainOn returns Unit, so call it on the configured model rather than inside the assignment
model.trainOn(DStream)
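A fuller application might look like the following sketch. The directory paths, the feature count of 3, the one-second batch interval, and the `local[2]` master are assumptions chosen for illustration. Input files are assumed to contain one `LabeledPoint` per line in the text form accepted by `LabeledPoint.parse`, for example `(1.0,[0.5,1.2,-0.3])`.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingLogisticRegressionSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical settings: adjust the master, batch interval, and paths for a real deployment.
    val conf = new SparkConf()
      .setAppName("StreamingLogisticRegressionSketch")
      .setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Each new text file dropped into these directories becomes part of a batch.
    val trainingData = ssc.textFileStream("/tmp/streaming-lr/train").map(LabeledPoint.parse)
    val testData = ssc.textFileStream("/tmp/streaming-lr/test").map(LabeledPoint.parse)

    // The initial weight vector must have one entry per feature (assumed to be 3 here).
    val numFeatures = 3
    val model = new StreamingLogisticRegressionWithSGD()
      .setStepSize(0.5)
      .setNumIterations(10)
      .setInitialWeights(Vectors.zeros(numFeatures))

    // Register training and prediction on the streams; nothing runs until ssc.start().
    model.trainOn(trainingData)
    model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

`predictOnValues` keeps the first element of each pair as a key, so pairing each test point with its true label makes the printed output easy to compare against the predictions.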
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
Constructor: Description
StreamingLogisticRegressionWithSGD(): Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}.
Method Summary
Method: Description
setInitialWeights(Vector initialWeights): Set the initial weights.
setMiniBatchFraction(double miniBatchFraction): Set the fraction of each batch to use for updates.
setNumIterations(int numIterations): Set the number of iterations of gradient descent to run per update.
setRegParam(double regParam): Set the regularization parameter.
setStepSize(double stepSize): Set the step size for gradient descent.
Methods inherited from class org.apache.spark.mllib.regression.StreamingLinearAlgorithm
latestModel, predictOn, predictOn, predictOnValues, predictOnValues, trainOn, trainOn
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
StreamingLogisticRegressionWithSGD
public StreamingLogisticRegressionWithSGD()
Construct a StreamingLogisticRegression object with default parameters: {stepSize: 0.1, numIterations: 50, miniBatchFraction: 1.0, regParam: 0.0}. Initial weights must be set before using trainOn or predictOn (see StreamingLinearAlgorithm).
-
-
Method Details
-
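setInitialWeights
public StreamingLogisticRegressionWithSGD setInitialWeights(Vector initialWeights)
Set the initial weights. Default: [0.0, 0.0].
-
setMiniBatchFraction
public StreamingLogisticRegressionWithSGD setMiniBatchFraction(double miniBatchFraction)
Set the fraction of each batch to use for updates. Default: 1.0.
-
setNumIterations
public StreamingLogisticRegressionWithSGD setNumIterations(int numIterations)
Set the number of iterations of gradient descent to run per update. Default: 50.
-
setRegParam
public StreamingLogisticRegressionWithSGD setRegParam(double regParam)
Set the regularization parameter. Default: 0.0.
-
setStepSize
public StreamingLogisticRegressionWithSGD setStepSize(double stepSize)
Set the step size for gradient descent. Default: 0.1.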
-