LogisticRegressionWithSGD
- class pyspark.mllib.classification.LogisticRegressionWithSGD
Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent.
New in version 0.9.0.
Deprecated since version 2.0.0: Use ml.classification.LogisticRegression or LogisticRegressionWithLBFGS.
Methods
- train(data[, iterations, step, ...]): Train a logistic regression model on the given data.
Methods Documentation
- classmethod train(data, iterations=100, step=1.0, miniBatchFraction=1.0, initialWeights=None, regParam=0.01, regType='l2', intercept=False, validateData=True, convergenceTol=0.001)
Train a logistic regression model on the given data.
New in version 0.9.0.
- Parameters
- data: pyspark.RDD
The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
- iterations: int, optional
The number of iterations. (default: 100)
- step: float, optional
The step size parameter used in SGD. (default: 1.0)
- miniBatchFraction: float, optional
Fraction of the data to be used for each SGD iteration. (default: 1.0)
- initialWeights: pyspark.mllib.linalg.Vector or convertible, optional
The initial weights. (default: None)
- regParam: float, optional
The regularization parameter. (default: 0.01)
- regType: str, optional
The type of regularizer used for training the model. Supported values:
“l1” for L1 regularization
“l2” for L2 regularization (default)
None for no regularization
- intercept: bool, optional
Whether to use the augmented representation for training data (i.e., whether bias features are activated). (default: False)
- validateData: bool, optional
Whether the algorithm should validate the data before training. (default: True)
- convergenceTol: float, optional
The convergence tolerance that decides when iteration terminates. (default: 0.001)
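To make the roles of these parameters concrete, the following is a minimal pure-Python sketch of mini-batch SGD for binary logistic regression. It is not Spark's implementation and does not use the pyspark API: the names train_sketch, predict, sigmoid, and the seed argument are hypothetical, and the decaying step size (step / sqrt(t)), the soft-thresholding L1 update, and the relative-change stopping rule are assumptions chosen for illustration. The sketch also regularizes the bias term, which is a simplification.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sketch(data, iterations=100, step=1.0, miniBatchFraction=1.0,
                 regParam=0.01, regType="l2", intercept=False,
                 convergenceTol=0.001, seed=0):
    """Illustrative only. data is a list of (label, features), label in {0, 1}."""
    rng = random.Random(seed)
    if intercept:
        # Augmented representation: append a constant bias feature.
        data = [(y, list(x) + [1.0]) for y, x in data]
    dim = len(data[0][1])
    w = [0.0] * dim
    for t in range(1, iterations + 1):
        # Sample roughly miniBatchFraction of the data for this iteration.
        batch = [p for p in data if rng.random() < miniBatchFraction]
        if not batch:
            continue
        # Average gradient of the logistic loss over the mini-batch.
        grad = [0.0] * dim
        for y, x in batch:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(batch)
        eta = step / math.sqrt(t)  # assumed decaying step size
        if regType == "l2":
            # Gradient step plus shrinkage from the L2 penalty.
            new_w = [wi * (1.0 - eta * regParam) - eta * g
                     for wi, g in zip(w, grad)]
        elif regType == "l1":
            # Gradient step followed by soft thresholding (assumed L1 handling).
            shrink = eta * regParam
            new_w = []
            for wi, g in zip(w, grad):
                v = wi - eta * g
                new_w.append(math.copysign(max(abs(v) - shrink, 0.0), v))
        else:
            # regType=None: plain gradient step, no regularization.
            new_w = [wi - eta * g for wi, g in zip(w, grad)]
        # Assumed stopping rule: small change relative to the weight norm.
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(new_w, w)))
        norm = math.sqrt(sum(a * a for a in new_w))
        w = new_w
        if diff < convergenceTol * max(norm, 1.0):
            break
    return w


def predict(w, x, intercept=False):
    # Classify as 1 when the predicted probability exceeds 0.5.
    x = list(x) + [1.0] if intercept else list(x)
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) > 0.5 else 0


# Tiny linearly separable example: negative features -> 0, positive -> 1.
points = [(0, [-2.0]), (0, [-1.0]), (1, [1.0]), (1, [2.0])]
weights = train_sketch(points, iterations=50)
print(weights, predict(weights, [3.0]), predict(weights, [-3.0]))
```

In this sketch, a smaller miniBatchFraction makes each iteration cheaper but noisier, a larger regParam pulls the weights toward zero, and a looser convergenceTol stops training earlier.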