Class SVMWithSGD
Object
org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm<SVMModel>
org.apache.spark.mllib.classification.SVMWithSGD
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. By default, L2 regularization is used, which can be changed via SVMWithSGD.optimizer.
- Note:
- Labels used in SVM should be {0, 1}.
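A minimal end-to-end sketch of the workflow described above, assuming a local Spark setup; the toy two-feature data, the local[*] master, and the class name SVMWithSGDExample are illustrative assumptions, not part of this API:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

public class SVMWithSGDExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("SVMWithSGDExample").setMaster("local[*]");
    JavaSparkContext jsc = new JavaSparkContext(conf);

    // Toy training data; labels must be 0 or 1 for SVM.
    JavaRDD<LabeledPoint> training = jsc.parallelize(Arrays.asList(
        new LabeledPoint(1.0, Vectors.dense(2.0, 3.0)),
        new LabeledPoint(1.0, Vectors.dense(2.5, 2.5)),
        new LabeledPoint(0.0, Vectors.dense(-1.0, -2.0)),
        new LabeledPoint(0.0, Vectors.dense(-0.5, -1.5))));

    // Train with 100 iterations of SGD and the default L2 regularization.
    SVMModel model = SVMWithSGD.train(training.rdd(), 100);

    // Predict the class (0.0 or 1.0) of a new point.
    double prediction = model.predict(Vectors.dense(1.5, 2.0));
    System.out.println("prediction = " + prediction);

    jsc.stop();
  }
}
```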
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
SVMWithSGD() - Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
-
Method Summary
GradientDescent optimizer()
The optimizer to solve the problem.
static SVMModel train(RDD<LabeledPoint> input, int numIterations)
Train an SVM model given an RDD of (label, features) pairs.
static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam)
Train an SVM model given an RDD of (label, features) pairs.
static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam, double miniBatchFraction)
Train an SVM model given an RDD of (label, features) pairs.
static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam, double miniBatchFraction, Vector initialWeights)
Train an SVM model given an RDD of (label, features) pairs.
Methods inherited from class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm
getNumFeatures, isAddIntercept, run, run, setIntercept, setValidateData
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
-
Constructor Details
-
SVMWithSGD
public SVMWithSGD()
Construct an SVM object with default parameters: {stepSize: 1.0, numIterations: 100, regParam: 0.01, miniBatchFraction: 1.0}.
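A short sketch of using this default-parameter constructor together with run(), assuming a JavaRDD<LabeledPoint> with {0, 1} labels is already available; the helper class and method names are hypothetical:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.regression.LabeledPoint;

public class ConstructorSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static SVMModel fitWithDefaults(JavaRDD<LabeledPoint> training) {
    // Defaults: stepSize = 1.0, numIterations = 100, regParam = 0.01, miniBatchFraction = 1.0.
    SVMWithSGD svm = new SVMWithSGD();
    return svm.run(training.rdd());
  }
}
```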
-
-
Method Details
-
train
public static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam, double miniBatchFraction, Vector initialWeights)
Train an SVM model given an RDD of (label, features) pairs. We run a fixed number of iterations of gradient descent using the specified step size. Each iteration uses miniBatchFraction of the data to calculate the gradient. The weights used in gradient descent are initialized using the initial weights provided.
- Parameters:
input - RDD of (label, array of features) pairs.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
regParam - Regularization parameter.
miniBatchFraction - Fraction of data to be used per iteration.
initialWeights - Initial set of weights to be used. Array should be equal in size to the number of features in the data.
- Returns:
- an SVMModel which has the weights and offset from training.
- Note:
- Labels used in SVM should be {0, 1}.
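A sketch of a warm start with this overload, assuming an existing training RDD with {0, 1} labels and a previously trained SVMModel; the helper names and hyperparameter values are illustrative:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.regression.LabeledPoint;

public class WarmStartSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static SVMModel refit(JavaRDD<LabeledPoint> training, SVMModel previous) {
    // Continue gradient descent from the weights of an earlier model instead of
    // starting from zeros; the vector length must equal the number of features.
    Vector initialWeights = previous.weights();
    return SVMWithSGD.train(
        training.rdd(),
        50,     // numIterations
        1.0,    // stepSize
        0.01,   // regParam
        1.0,    // miniBatchFraction
        initialWeights);
  }
}
```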
-
train
public static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam, double miniBatchFraction)
Train an SVM model given an RDD of (label, features) pairs. We run a fixed number of iterations of gradient descent using the specified step size. Each iteration uses miniBatchFraction of the data to calculate the gradient.
- Parameters:
input - RDD of (label, array of features) pairs.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
regParam - Regularization parameter.
miniBatchFraction - Fraction of data to be used per iteration.
- Returns:
- an SVMModel which has the weights and offset from training.
- Note:
- Labels used in SVM should be {0, 1}.
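A sketch of mini-batch training with this overload, assuming an existing training RDD with {0, 1} labels; the 10% sampling fraction and other values are illustrative:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.regression.LabeledPoint;

public class MiniBatchSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static SVMModel fit(JavaRDD<LabeledPoint> training) {
    // Each iteration samples roughly 10% of the data to estimate the gradient,
    // trading some accuracy per step for cheaper iterations.
    return SVMWithSGD.train(training.rdd(), 200, 1.0, 0.01, 0.1);
  }
}
```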
-
train
public static SVMModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double regParam)
Train an SVM model given an RDD of (label, features) pairs. We run a fixed number of iterations of gradient descent using the specified step size. We use the entire data set to update the gradient in each iteration.
- Parameters:
input - RDD of (label, array of features) pairs.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
regParam - Regularization parameter.
- Returns:
- an SVMModel which has the weights and offset from training.
- Note:
- Labels used in SVM should be {0, 1}.
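A sketch of this full-batch overload, assuming an existing training RDD with {0, 1} labels; the step size and regularization values are illustrative:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.regression.LabeledPoint;

public class FullBatchSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static SVMModel fit(JavaRDD<LabeledPoint> training) {
    // Full-batch gradient descent: every iteration uses the whole data set.
    // A smaller step size and heavier regularization than the defaults,
    // chosen purely as illustrative values.
    return SVMWithSGD.train(training.rdd(), 100, 0.5, 0.1);
  }
}
```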
-
train
public static SVMModel train(RDD<LabeledPoint> input, int numIterations)
Train an SVM model given an RDD of (label, features) pairs. We run a fixed number of iterations of gradient descent using a step size of 1.0. We use the entire data set to update the gradient in each iteration.
- Parameters:
input - RDD of (label, array of features) pairs.
numIterations - Number of iterations of gradient descent to run.
- Returns:
- an SVMModel which has the weights and offset from training.
- Note:
- Labels used in SVM should be {0, 1}.
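A sketch of the simplest overload, assuming an existing training RDD with {0, 1} labels; clearing the model threshold afterwards to obtain raw margins is optional and shown only as a usage note:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

public class SimpleTrainSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static void fitAndScore(JavaRDD<LabeledPoint> training) {
    // Step size 1.0 and full-batch gradients are implied by this overload.
    SVMModel model = SVMWithSGD.train(training.rdd(), 100);

    // By default predict() returns 0.0 or 1.0; clearing the threshold makes it
    // return the raw margin instead, useful for ranking or threshold tuning.
    model.clearThreshold();
    double margin = model.predict(Vectors.dense(1.5, 2.0));
    System.out.println("raw margin = " + margin);
  }
}
```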
-
optimizer
public GradientDescent optimizer()
Description copied from class: GeneralizedLinearAlgorithm
The optimizer to solve the problem.
- Specified by:
optimizer in class GeneralizedLinearAlgorithm<SVMModel>
- Returns:
- (undocumented)
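A sketch of customizing the returned optimizer before calling run(), for example to switch from the default L2 updater to L1 regularization as mentioned in the class description; the hyperparameter values and helper names are illustrative:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.classification.SVMModel;
import org.apache.spark.mllib.classification.SVMWithSGD;
import org.apache.spark.mllib.optimization.L1Updater;
import org.apache.spark.mllib.regression.LabeledPoint;

public class OptimizerSketch {
  // 'training' is assumed to hold points with labels in {0, 1}.
  static SVMModel fitL1(JavaRDD<LabeledPoint> training) {
    SVMWithSGD svmAlg = new SVMWithSGD();
    // Tune the underlying gradient descent directly, including swapping the
    // default L2 updater for L1 regularization.
    svmAlg.optimizer()
        .setNumIterations(200)
        .setStepSize(1.0)
        .setRegParam(0.1)
        .setMiniBatchFraction(1.0)
        .setUpdater(new L1Updater());
    return svmAlg.run(training.rdd());
  }
}
```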
-