Class LinearSVC

All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, ClassifierParams, LinearSVCParams, Params, HasAggregationDepth, HasFeaturesCol, HasFitIntercept, HasLabelCol, HasMaxBlockSizeInMB, HasMaxIter, HasPredictionCol, HasRawPredictionCol, HasRegParam, HasStandardization, HasThreshold, HasTol, HasWeightCol, PredictorParams, DefaultParamsWritable, Identifiable, MLWritable

Linear SVM Classifier

This binary classifier optimizes the hinge loss using the OWLQN optimizer. Currently, only L2 regularization is supported.

Since 3.1.0, it supports stacking instances into blocks and using GEMV for better performance. If param maxBlockSizeInMB is left at its default of 0.0, a block size of 1.0 MB is used.
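
To make the objective described above concrete, the following is a minimal plain-Java sketch of a hinge loss with an L2 penalty. It is not Spark's implementation: the class name HingeObjectiveSketch, the 0.5 * regParam scaling of the penalty, and leaving the intercept unregularized are assumptions made for illustration.

```java
// Illustrative sketch only, not Spark code. Labels are 0.0/1.0 and are mapped
// to -1/+1 before the hinge loss is computed.
final class HingeObjectiveSketch {

    /** Margin w.x + b for one feature vector. */
    static double margin(double[] w, double b, double[] x) {
        double m = b;
        for (int i = 0; i < w.length; i++) {
            m += w[i] * x[i];
        }
        return m;
    }

    /** Hinge loss max(0, 1 - y * margin), with the 0/1 label mapped to -1/+1. */
    static double hingeLoss(double label, double margin) {
        double y = 2.0 * label - 1.0;
        return Math.max(0.0, 1.0 - y * margin);
    }

    /** Mean hinge loss plus an L2 penalty (the 0.5 * regParam scaling is an assumption). */
    static double objective(double[] w, double b, double[][] xs, double[] labels, double regParam) {
        double loss = 0.0;
        for (int i = 0; i < xs.length; i++) {
            loss += hingeLoss(labels[i], margin(w, b, xs[i]));
        }
        double l2 = 0.0;
        for (double wi : w) {
            l2 += wi * wi; // the intercept b is left out of the penalty
        }
        return loss / xs.length + 0.5 * regParam * l2;
    }
}
```

An instance on the correct side of the margin contributes zero loss, so only instances inside the margin or misclassified influence the fit.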

  • Constructor Details

    • LinearSVC

      public LinearSVC(String uid)
    • LinearSVC

      public LinearSVC()
  • Method Details

    • load

      public static LinearSVC load(String path)
    • read

      public static MLReader<T> read()
    • threshold

      public final DoubleParam threshold()
      Description copied from interface: LinearSVCParams
      Param for threshold in binary classification prediction. For LinearSVC, this threshold is applied to the rawPrediction, rather than a probability. This threshold can be any real number, where Inf will make all predictions 0.0 and -Inf will make all predictions 1.0. Default: 0.0

      Specified by:
      threshold in interface HasThreshold
      Specified by:
      threshold in interface LinearSVCParams
      Returns:
      (undocumented)
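
The thresholding rule described above can be sketched in plain Java as follows. The class ThresholdSketch is hypothetical, and the use of a strict comparison at exact equality is an assumption; the page only states that the threshold is applied to the raw prediction.

```java
// Illustrative sketch only, not Spark code.
final class ThresholdSketch {

    /** Returns 1.0 when the raw margin exceeds the threshold, otherwise 0.0. */
    static double predict(double rawMargin, double threshold) {
        // Assumption: strict comparison, so a margin equal to the threshold maps to 0.0.
        return rawMargin > threshold ? 1.0 : 0.0;
    }
}
```

With this rule, a threshold of Double.POSITIVE_INFINITY yields 0.0 for every finite margin, and Double.NEGATIVE_INFINITY yields 1.0, matching the description of Inf and -Inf.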
    • maxBlockSizeInMB

      public final DoubleParam maxBlockSizeInMB()
      Description copied from interface: HasMaxBlockSizeInMB
      Param for the maximum memory in MB for stacking input data into blocks. Data is stacked within partitions. If the value exceeds the remaining data size in a partition, it is adjusted to that data size. The default of 0.0 means an optimal value is chosen, which depends on the specific algorithm. Must be >= 0.
      Specified by:
      maxBlockSizeInMB in interface HasMaxBlockSizeInMB
      Returns:
      (undocumented)
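
As a rough illustration of stacking instances into blocks and computing margins with one matrix-vector product (GEMV) per block, consider the plain-Java sketch below. It is not how Spark lays out its blocks: the class BlockGemvSketch, the dense row-major layout, and the 8-bytes-per-value sizing rule are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not Spark code.
final class BlockGemvSketch {

    /** Rows that fit in maxBlockSizeInMB, assuming 8 bytes per value; at least 1. */
    static int rowsPerBlock(double maxBlockSizeInMB, int numFeatures) {
        long maxBytes = (long) (maxBlockSizeInMB * 1024 * 1024);
        return (int) Math.max(1L, maxBytes / (8L * numFeatures));
    }

    /** Stacks rows into row-major blocks; the last block shrinks to the remaining rows. */
    static List<double[]> stack(double[][] rows, int rowsPerBlock) {
        List<double[]> blocks = new ArrayList<>();
        if (rows.length == 0) {
            return blocks;
        }
        int numFeatures = rows[0].length;
        for (int start = 0; start < rows.length; start += rowsPerBlock) {
            int n = Math.min(rowsPerBlock, rows.length - start);
            double[] block = new double[n * numFeatures];
            for (int r = 0; r < n; r++) {
                System.arraycopy(rows[start + r], 0, block, r * numFeatures, numFeatures);
            }
            blocks.add(block);
        }
        return blocks;
    }

    /** GEMV: y = A * w + b for a row-major block A, giving one margin per stacked row. */
    static double[] gemv(double[] block, int numFeatures, double[] w, double b) {
        int n = block.length / numFeatures;
        double[] y = new double[n];
        for (int r = 0; r < n; r++) {
            double sum = b;
            for (int c = 0; c < numFeatures; c++) {
                sum += block[r * numFeatures + c] * w[c];
            }
            y[r] = sum;
        }
        return y;
    }
}
```

Processing a whole block at once replaces many small dot products with a single contiguous pass over memory, which is the kind of access pattern optimized linear algebra routines are built for.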
    • aggregationDepth

      public final IntParam aggregationDepth()
      Description copied from interface: HasAggregationDepth
      Param for suggested depth for treeAggregate (>= 2).
      Specified by:
      aggregationDepth in interface HasAggregationDepth
      Returns:
      (undocumented)
    • weightCol

      public final Param<String> weightCol()
      Description copied from interface: HasWeightCol
      Param for weight column name. If this is not set or empty, we treat all instance weights as 1.0.
      Specified by:
      weightCol in interface HasWeightCol
      Returns:
      (undocumented)
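
The effect of instance weights can be pictured with the small plain-Java sketch below, in which a missing weight column is modeled as a null array and every instance then gets weight 1.0. The class InstanceWeightSketch and the normalization by the sum of weights are assumptions, not a statement of what Spark computes internally.

```java
// Illustrative sketch only, not Spark code.
final class InstanceWeightSketch {

    /** Weighted mean of per-instance losses; a null weights array means every weight is 1.0. */
    static double weightedMean(double[] losses, double[] weights) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < losses.length; i++) {
            double w = (weights == null) ? 1.0 : weights[i];
            numerator += w * losses[i];
            denominator += w;
        }
        return numerator / denominator;
    }
}
```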
    • standardization

      public final BooleanParam standardization()
      Description copied from interface: HasStandardization
      Param for whether to standardize the training features before fitting the model.
      Specified by:
      standardization in interface HasStandardization
      Returns:
      (undocumented)
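
One common form of feature standardization is to divide each feature by its standard deviation, as in the plain-Java sketch below. This page does not say whether features are also centered or how constant columns are handled, so the choices here (sample standard deviation, no centering, constant columns mapped to 0.0) and the class StandardizationSketch are assumptions.

```java
// Illustrative sketch only, not Spark code.
final class StandardizationSketch {

    /** Sample standard deviation (n - 1 denominator) of each column; needs at least 2 rows. */
    static double[] columnStd(double[][] rows) {
        int n = rows.length;
        int d = rows[0].length;
        double[] std = new double[d];
        for (int c = 0; c < d; c++) {
            double mean = 0.0;
            for (double[] row : rows) {
                mean += row[c];
            }
            mean /= n;
            double sumSq = 0.0;
            for (double[] row : rows) {
                sumSq += (row[c] - mean) * (row[c] - mean);
            }
            std[c] = Math.sqrt(sumSq / (n - 1));
        }
        return std;
    }

    /** Divides each feature by its column std; zero-variance columns are set to 0.0. */
    static double[][] scale(double[][] rows, double[] std) {
        double[][] out = new double[rows.length][std.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < std.length; c++) {
                out[r][c] = (std[c] == 0.0) ? 0.0 : rows[r][c] / std[c];
            }
        }
        return out;
    }
}
```

Putting features on a comparable scale means a single regularization strength penalizes all coefficients more evenly.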
    • tol

      public final DoubleParam tol()
      Description copied from interface: HasTol
      Param for the convergence tolerance for iterative algorithms (>= 0).
      Specified by:
      tol in interface HasTol
      Returns:
      (undocumented)
    • fitIntercept

      public final BooleanParam fitIntercept()
      Description copied from interface: HasFitIntercept
      Param for whether to fit an intercept term.
      Specified by:
      fitIntercept in interface HasFitIntercept
      Returns:
      (undocumented)
    • maxIter

      public final IntParam maxIter()
      Description copied from interface: HasMaxIter
      Param for maximum number of iterations (>= 0).
      Specified by:
      maxIter in interface HasMaxIter
      Returns:
      (undocumented)
    • regParam

      public final DoubleParam regParam()
      Description copied from interface: HasRegParam
      Param for regularization parameter (>= 0).
      Specified by:
      regParam in interface HasRegParam
      Returns:
      (undocumented)
    • uid

      public String uid()
      Description copied from interface: Identifiable
      An immutable unique ID for the object and its derivatives.
      Specified by:
      uid in interface Identifiable
      Returns:
      (undocumented)
    • setRegParam

      public LinearSVC setRegParam(double value)
      Set the regularization parameter. Default is 0.0.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setMaxIter

      public LinearSVC setMaxIter(int value)
      Set the maximum number of iterations. Default is 100.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setFitIntercept

      public LinearSVC setFitIntercept(boolean value)
      Whether to fit an intercept term. Default is true.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setTol

      public LinearSVC setTol(double value)
      Set the convergence tolerance of iterations. Smaller values will lead to higher accuracy at the cost of more iterations. Default is 1E-6.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
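
The trade-off between tolerance and iteration count can be seen in a toy plain-Java loop. This is a stand-in, not the OWLQN optimizer: it runs fixed-step gradient descent on f(x) = (x - 3)^2 and stops when the absolute change in the objective drops below tol or when maxIter is reached. The class ToleranceSketch, the step size, and the stopping test are all assumptions.

```java
// Illustrative sketch only, not Spark code and not OWLQN.
final class ToleranceSketch {

    /** Final iterate and the number of iterations actually run. */
    static final class Result {
        final double x;
        final int iterations;

        Result(double x, int iterations) {
            this.x = x;
            this.iterations = iterations;
        }
    }

    static double f(double x) {
        return (x - 3.0) * (x - 3.0);
    }

    /** Gradient descent from x = 0 with step 0.1, bounded by maxIter and tol. */
    static Result minimize(double tol, int maxIter) {
        double x = 0.0;
        double previous = f(x);
        int iterations = 0;
        while (iterations < maxIter) {
            x -= 0.1 * 2.0 * (x - 3.0); // gradient of f is 2 * (x - 3)
            iterations++;
            double current = f(x);
            if (Math.abs(previous - current) < tol) {
                break; // objective change fell below the tolerance
            }
            previous = current;
        }
        return new Result(x, iterations);
    }
}
```

Calling minimize with a smaller tol runs more iterations and lands closer to the minimizer at 3.0, while a small maxIter caps the work regardless of tol.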
    • setStandardization

      public LinearSVC setStandardization(boolean value)
      Whether to standardize the training features before fitting the model. Default is true.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setWeightCol

      public LinearSVC setWeightCol(String value)
      Sets the value of param weightCol(). If this is not set or empty, all instance weights are treated as 1.0. By default it is not set.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setThreshold

      public LinearSVC setThreshold(double value)
      Sets the threshold in binary classification.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • setAggregationDepth

      public LinearSVC setAggregationDepth(int value)
      Suggested depth for treeAggregate (greater than or equal to 2). If the feature dimension or the number of partitions is large, this param can be set to a larger value. Default is 2.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
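
To give a feel for why a larger depth helps with many partitions, the plain-Java sketch below combines per-partition partial results level by level instead of all at once. It is only loosely modeled on the idea of a tree aggregation: the class TreeAggregateSketch and the fan-in rule are assumptions, and real partitions would hold gradient vectors rather than single numbers.

```java
// Illustrative sketch only, not Spark code.
final class TreeAggregateSketch {

    /** Fan-in per level so that roughly `depth` levels cover all partitions; at least 2. */
    static int scale(int numPartitions, int depth) {
        return Math.max((int) Math.ceil(Math.pow(numPartitions, 1.0 / depth)), 2);
    }

    /** Number of merge levels needed to reduce numPartitions partial results to one. */
    static int levels(int numPartitions, int depth) {
        int fanIn = scale(numPartitions, depth);
        int count = 0;
        for (int n = numPartitions; n > 1; n = (n + fanIn - 1) / fanIn) {
            count++;
        }
        return count;
    }

    /** Sums the partial results by merging groups of `scale` values per level. */
    static double treeSum(double[] partials, int depth) {
        if (partials.length == 0) {
            return 0.0;
        }
        int fanIn = scale(partials.length, depth);
        double[] level = partials.clone();
        while (level.length > 1) {
            double[] merged = new double[(level.length + fanIn - 1) / fanIn];
            for (int i = 0; i < level.length; i++) {
                merged[i / fanIn] += level[i];
            }
            level = merged;
        }
        return level[0];
    }
}
```

With a greater depth, each merge step receives fewer inputs, so no single node has to combine the results of every partition at once.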
    • setMaxBlockSizeInMB

      public LinearSVC setMaxBlockSizeInMB(double value)
      Sets the value of param maxBlockSizeInMB(). The default is 0.0, in which case a block size of 1.0 MB is used.

      Parameters:
      value - (undocumented)
      Returns:
      (undocumented)
    • copy

      public LinearSVC copy(ParamMap extra)
      Description copied from interface: Params
      Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly. See defaultCopy().
      Specified by:
      copy in interface Params
      Specified by:
      copy in class Predictor<Vector,LinearSVC,LinearSVCModel>
      Parameters:
      extra - (undocumented)
      Returns:
      (undocumented)