org.apache.spark.ml.classification
Class GBTClassifier

Object
  extended by org.apache.spark.ml.PipelineStage
      extended by org.apache.spark.ml.Estimator<M>
          extended by org.apache.spark.ml.Predictor<Vector,GBTClassifier,GBTClassificationModel>
              extended by org.apache.spark.ml.classification.GBTClassifier
All Implemented Interfaces:
java.io.Serializable, Logging, Params

public final class GBTClassifier
extends Predictor<Vector,GBTClassifier,GBTClassificationModel>
implements Logging

:: Experimental :: Gradient-Boosted Trees (GBTs) learning algorithm for classification. It supports binary labels and both continuous and categorical features. Note: Multiclass labels are not currently supported.
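
To make the binary-label restriction concrete, here is a minimal sketch of how a gradient-boosted ensemble is commonly turned into a 0/1 prediction: the tree outputs are combined into a weighted sum (the margin), and the sign of that sum picks the label. The class GbtPredictionSketch, the Tree interface, and the weights used are hypothetical illustrations, not part of the Spark API, and the thresholding rule is an assumption about the usual formulation rather than something this page states.

```java
// Hypothetical illustration only; none of these names belong to Spark.
final class GbtPredictionSketch {

    /** A regression tree reduced to its essentials: features in, real-valued score out. */
    interface Tree {
        double predict(double[] features);
    }

    /** Weighted sum of the tree outputs, i.e. the raw margin F(x) of the ensemble. */
    static double margin(Tree[] trees, double[] weights, double[] features) {
        if (trees.length != weights.length) {
            throw new IllegalArgumentException("trees and weights must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < trees.length; i++) {
            sum += weights[i] * trees[i].predict(features);
        }
        return sum;
    }

    /** Binary label: 1.0 when the margin is positive, otherwise 0.0 (assumed rule). */
    static double predictLabel(Tree[] trees, double[] weights, double[] features) {
        return margin(trees, weights, features) > 0.0 ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        Tree[] trees = {
            x -> x[0] > 0.5 ? 1.0 : -1.0,  // stump on feature 0
            x -> x[1] > 2.0 ? 0.5 : -0.5   // stump on feature 1
        };
        double[] weights = {1.0, 0.1};     // made-up weights for the example
        System.out.println(predictLabel(trees, weights, new double[] {0.9, 1.0}));
        System.out.println(predictLabel(trees, weights, new double[] {0.1, 3.0}));
    }
}
```

Because the decision is a single sign test on one margin, a model of this shape distinguishes exactly two classes, which is consistent with the note that multiclass labels are not supported.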

See Also:
Serialized Form

Constructor Summary
GBTClassifier()
           
GBTClassifier(String uid)
           
 
Method Summary
 GBTClassifier copy(ParamMap extra)
          Creates a copy of this instance with the same UID and some extra params.
 String getLossType()
           
 Param<String> lossType()
          Loss function which GBT tries to minimize.
 GBTClassifier setCacheNodeIds(boolean value)
           
 GBTClassifier setCheckpointInterval(int value)
           
 GBTClassifier setImpurity(String value)
          The impurity setting is ignored for GBT models.
 GBTClassifier setLossType(String value)
           
 GBTClassifier setMaxBins(int value)
           
 GBTClassifier setMaxDepth(int value)
           
 GBTClassifier setMaxIter(int value)
           
 GBTClassifier setMaxMemoryInMB(int value)
           
 GBTClassifier setMinInfoGain(double value)
           
 GBTClassifier setMinInstancesPerNode(int value)
           
 GBTClassifier setSeed(long value)
           
 GBTClassifier setStepSize(double value)
           
 GBTClassifier setSubsamplingRate(double value)
           
static String[] supportedLossTypes()
          Accessor for supported loss settings: logistic
 String uid()
           
 StructType validateAndTransformSchema(StructType schema, boolean fitting, DataType featuresDataType)
          Validates and transforms the input schema with the provided param map.
 
Methods inherited from class org.apache.spark.ml.Predictor
fit, setFeaturesCol, setLabelCol, setPredictionCol, transformSchema
 
Methods inherited from class org.apache.spark.ml.Estimator
fit, fit, fit, fit
 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
 
Methods inherited from interface org.apache.spark.ml.param.Params
clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, setDefault, shouldOwn, validateParams
 

Constructor Detail

GBTClassifier

public GBTClassifier(String uid)

GBTClassifier

public GBTClassifier()

Method Detail

supportedLossTypes

public static final String[] supportedLossTypes()
Accessor for supported loss settings: logistic
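
The lossType param on this page is documented as case-insensitive, with "logistic" as the only supported value. The following minimal sketch shows one way such a check could be written against a list of supported names. LossTypeCheckSketch, SUPPORTED, and normalize are hypothetical names used for illustration; this is not how Spark itself validates the param.

```java
// Hypothetical illustration only; not Spark's validation code.
final class LossTypeCheckSketch {

    // Mirrors the single value documented for supportedLossTypes().
    static final String[] SUPPORTED = {"logistic"};

    /** Returns the canonical lower-case name, or throws if the name is not supported. */
    static String normalize(String lossType) {
        // Locale.ROOT keeps the comparison independent of the default locale.
        String lower = lossType.toLowerCase(java.util.Locale.ROOT);
        for (String supported : SUPPORTED) {
            if (supported.equals(lower)) {
                return lower;
            }
        }
        throw new IllegalArgumentException("Unsupported loss type: " + lossType);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Logistic"));
    }
}
```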


uid

public String uid()

setMaxDepth

public GBTClassifier setMaxDepth(int value)

setMaxBins

public GBTClassifier setMaxBins(int value)

setMinInstancesPerNode

public GBTClassifier setMinInstancesPerNode(int value)

setMinInfoGain

public GBTClassifier setMinInfoGain(double value)

setMaxMemoryInMB

public GBTClassifier setMaxMemoryInMB(int value)

setCacheNodeIds

public GBTClassifier setCacheNodeIds(boolean value)

setCheckpointInterval

public GBTClassifier setCheckpointInterval(int value)

setImpurity

public GBTClassifier setImpurity(String value)
The impurity setting is ignored for GBT models. Individual trees are built using impurity "Variance".

Parameters:
value - impurity name; ignored for GBT models
Returns:
this GBTClassifier
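
As a rough illustration of what a "Variance" impurity measures, the sketch below computes the population variance of the labels at a node and the impurity reduction achieved by a candidate split. VarianceImpuritySketch and its methods are hypothetical names, and the gain formula (parent impurity minus the size-weighted child impurities) is the textbook definition, assumed here rather than taken from this page. A threshold such as the one set by setMinInfoGain would typically be compared against a quantity of this kind.

```java
// Hypothetical illustration only; not Spark's tree-building code.
final class VarianceImpuritySketch {

    /** Population variance of the labels at a node: E[y^2] - (E[y])^2. */
    static double variance(double[] labels) {
        if (labels.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        double sumSquares = 0.0;
        for (double y : labels) {
            sum += y;
            sumSquares += y * y;
        }
        double mean = sum / labels.length;
        return sumSquares / labels.length - mean * mean;
    }

    /** Impurity reduction when a node's labels are split into left and right children. */
    static double gain(double[] left, double[] right) {
        int n = left.length + right.length;
        if (n == 0) {
            return 0.0;
        }
        double[] parent = new double[n];
        System.arraycopy(left, 0, parent, 0, left.length);
        System.arraycopy(right, 0, parent, left.length, right.length);
        double leftWeight = left.length / (double) n;
        double rightWeight = right.length / (double) n;
        return variance(parent) - leftWeight * variance(left) - rightWeight * variance(right);
    }

    public static void main(String[] args) {
        // A split that separates the two label values perfectly.
        System.out.println(gain(new double[] {0, 0}, new double[] {1, 1}));
        // A split that leaves both children as mixed as the parent.
        System.out.println(gain(new double[] {0, 1}, new double[] {0, 1}));
    }
}
```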

setSubsamplingRate

public GBTClassifier setSubsamplingRate(double value)

setSeed

public GBTClassifier setSeed(long value)

setMaxIter

public GBTClassifier setMaxIter(int value)

setStepSize

public GBTClassifier setStepSize(double value)

lossType

public Param<String> lossType()
Loss function which GBT tries to minimize (case-insensitive). Supported: "logistic". Default: "logistic".

Returns:
the lossType param
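
This page names the "logistic" loss without giving its formula. The sketch below assumes one common form used in gradient boosting, 2 * log(1 + exp(-2 * y * F)), where y is the label mapped to -1 or +1 and F is the ensemble margin; the exact scaling is an assumption here. LogisticLossSketch and its methods are hypothetical names, not Spark API.

```java
// Hypothetical illustration only; the scaling of the loss is an assumption.
final class LogisticLossSketch {

    /** Numerically stable log(1 + exp(x)), avoiding overflow for large positive x. */
    static double log1pExp(double x) {
        return x > 0 ? x + Math.log1p(Math.exp(-x)) : Math.log1p(Math.exp(x));
    }

    /** Maps a 0/1 label to the -1/+1 encoding used by the loss. */
    static double toSigned(double label01) {
        return 2.0 * label01 - 1.0;
    }

    /** Loss 2 * log(1 + exp(-2 y F)) for a signed label y and margin F. */
    static double loss(double y, double margin) {
        return 2.0 * log1pExp(-2.0 * y * margin);
    }

    /** Derivative of the loss with respect to F: -4 y / (1 + exp(2 y F)). */
    static double gradient(double y, double margin) {
        return -4.0 * y / (1.0 + Math.exp(2.0 * y * margin));
    }

    public static void main(String[] args) {
        double y = toSigned(1.0);
        System.out.println("loss at margin 0: " + loss(y, 0.0));
        System.out.println("gradient at margin 0: " + gradient(y, 0.0));
    }
}
```

In a boosting loop of this kind, each new tree is fit to the negative gradient of the loss at the current margins, so confidently correct examples (large y * F) contribute gradients close to zero.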

setLossType

public GBTClassifier setLossType(String value)

getLossType

public String getLossType()

copy

public GBTClassifier copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params. Subclasses should implement this method and set the return type properly.

Specified by:
copy in interface Params
Specified by:
copy in class Predictor<Vector,GBTClassifier,GBTClassificationModel>
Parameters:
extra - extra params to apply to the copy
Returns:
the copied GBTClassifier
See Also:
defaultCopy()

validateAndTransformSchema

public StructType validateAndTransformSchema(StructType schema,
                                             boolean fitting,
                                             DataType featuresDataType)
Validates and transforms the input schema with the provided param map.

Parameters:
schema - input schema
fitting - whether this call happens during fitting (training)
featuresDataType - SQL DataType for FeaturesType. E.g., VectorUDT for vector features.
Returns:
output schema