Interface DecisionTreeParams

All Superinterfaces:
HasCheckpointInterval, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasSeed, HasWeightCol, Identifiable, Params, PredictorParams, Serializable, scala.Serializable
All Known Subinterfaces:
DecisionTreeClassifierParams, DecisionTreeRegressorParams, GBTClassifierParams, GBTParams, GBTRegressorParams, RandomForestClassifierParams, RandomForestParams, RandomForestRegressorParams, TreeEnsembleClassifierParams, TreeEnsembleParams, TreeEnsembleRegressorParams
All Known Implementing Classes:
DecisionTreeClassificationModel, DecisionTreeClassifier, DecisionTreeRegressionModel, DecisionTreeRegressor, GBTClassificationModel, GBTClassifier, GBTRegressionModel, GBTRegressor, RandomForestClassificationModel, RandomForestClassifier, RandomForestRegressionModel, RandomForestRegressor

public interface DecisionTreeParams extends PredictorParams, HasCheckpointInterval, HasSeed, HasWeightCol
Parameters for Decision Tree-based algorithms.

Note: Marked as private since this may be made public in the future.

  • Method Details

    • cacheNodeIds

      BooleanParam cacheNodeIds()
      If false, the algorithm will pass trees to executors to match instances with nodes. If true, the algorithm will cache node IDs for each instance. Caching can speed up training of deeper trees. Users can set how often the cache is checkpointed, or disable checkpointing, by setting checkpointInterval. (default = false)
    • getCacheNodeIds

      boolean getCacheNodeIds()
    • getLeafCol

      String getLeafCol()
    • getMaxBins

      int getMaxBins()
    • getMaxDepth

      int getMaxDepth()
    • getMaxMemoryInMB

      int getMaxMemoryInMB()
    • getMinInfoGain

      double getMinInfoGain()
    • getMinInstancesPerNode

      int getMinInstancesPerNode()
    • getMinWeightFractionPerNode

      double getMinWeightFractionPerNode()
    • getOldStrategy

      Strategy getOldStrategy(scala.collection.immutable.Map<Object,Object> categoricalFeatures, int numClasses, scala.Enumeration.Value oldAlgo, Impurity oldImpurity, double subsamplingRate)
      (private[ml]) Create a Strategy instance to use with the old API.
    • leafCol

      Param<String> leafCol()
      Name of the leaf indices column, which holds the predicted leaf index of each instance in each tree, numbered in preorder. (default = "")
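
The preorder numbering mentioned above can be illustrated with a minimal, Spark-independent sketch. The `LeafIndexSketch` class and its `Node` type are hypothetical and exist only for this example. The sketch also assumes that only leaf nodes receive an index (internal nodes are skipped), which the description above does not state explicitly.

```java
/** Hypothetical illustration of preorder leaf numbering; not part of Spark. */
final class LeafIndexSketch {

    /** Minimal binary tree node; a node without children is a leaf. */
    static final class Node {
        final Node left;
        final Node right;
        int leafIndex = -1; // stays -1 for internal nodes

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /**
     * Visits the tree in preorder (node, left subtree, right subtree) and
     * assigns consecutive indices to the leaves, starting at {@code next}.
     * Assumption: internal nodes do not consume an index.
     *
     * @return the next unused index
     */
    static int indexLeaves(Node node, int next) {
        if (node.isLeaf()) {
            node.leafIndex = next;
            return next + 1;
        }
        next = indexLeaves(node.left, next);
        return indexLeaves(node.right, next);
    }

    public static void main(String[] args) {
        Node a = new Node(null, null);
        Node b = new Node(null, null);
        Node c = new Node(null, null);
        Node root = new Node(new Node(a, b), c);
        indexLeaves(root, 0);
        System.out.println(a.leafIndex + " " + b.leafIndex + " " + c.leafIndex);
    }
}
```

Under these assumptions, the leaves of the left subtree are numbered before any leaf of the right subtree.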
    • maxBins

      IntParam maxBins()
      Maximum number of bins used for discretizing continuous features and for choosing how to split on features at each node. More bins give higher granularity. Must be at least 2 and at least the number of categories in any categorical feature. (default = 32)
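
The two lower bounds on maxBins can be checked before training. The following is a minimal sketch in plain Java. `MaxBinsCheck` and its methods are hypothetical helpers, not part of Spark, and the category counts are assumed to be supplied by the caller as an array with one entry per categorical feature.

```java
/** Hypothetical helper that mirrors the documented maxBins constraint; not part of Spark. */
final class MaxBinsCheck {

    /**
     * Smallest maxBins value allowed by the documented rule:
     * at least 2, and at least the largest category count of any categorical feature.
     *
     * @param categoryCounts number of categories of each categorical feature (may be empty)
     */
    static int minimumMaxBins(int[] categoryCounts) {
        int largest = 0;
        for (int count : categoryCounts) {
            largest = Math.max(largest, count);
        }
        return Math.max(2, largest);
    }

    /** Returns true if maxBins satisfies both lower bounds. */
    static boolean isValid(int maxBins, int[] categoryCounts) {
        return maxBins >= minimumMaxBins(categoryCounts);
    }

    public static void main(String[] args) {
        int[] counts = {4, 50};
        System.out.println(minimumMaxBins(counts)); // prints 50
        System.out.println(isValid(32, counts));    // prints false
    }
}
```

In this example the default of 32 would be too small for a feature with 50 categories, so maxBins would have to be raised to at least 50.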
    • maxDepth

      IntParam maxDepth()
      Maximum depth of the tree (nonnegative). E.g., depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes. (default = 5)
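
The depth examples above follow from the size of a full binary tree: at depth d there are at most 2^d leaf nodes and 2^d - 1 internal nodes. A minimal sketch of that arithmetic is shown below. `TreeSizeBound` is a hypothetical helper and not part of Spark, and the numbers are upper bounds, since a trained tree may stop splitting earlier.

```java
/** Hypothetical helper computing upper bounds on binary tree size; not part of Spark. */
final class TreeSizeBound {

    /** Maximum number of leaf nodes in a binary tree of the given depth: 2^depth. */
    static long maxLeaves(int depth) {
        if (depth < 0 || depth > 62) {
            throw new IllegalArgumentException("depth must be in [0, 62], got " + depth);
        }
        return 1L << depth;
    }

    /** Maximum number of internal (non-leaf) nodes: 2^depth - 1. */
    static long maxInternalNodes(int depth) {
        return maxLeaves(depth) - 1;
    }

    /** Maximum total number of nodes: 2^(depth + 1) - 1. */
    static long maxNodes(int depth) {
        return maxLeaves(depth) + maxInternalNodes(depth);
    }

    public static void main(String[] args) {
        for (int depth : new int[] {0, 1, 5}) {
            System.out.println("depth " + depth + ": up to " + maxInternalNodes(depth)
                    + " internal + " + maxLeaves(depth) + " leaf nodes");
        }
    }
}
```

With the default depth of 5, a tree can therefore have at most 31 internal nodes and 32 leaf nodes.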
    • maxMemoryInMB

      IntParam maxMemoryInMB()
      Maximum memory in MB allocated to histogram aggregation. If this value is too small, only 1 node will be split per iteration, and its aggregates may exceed this size. (default = 256 MB)
    • minInfoGain

      DoubleParam minInfoGain()
      Minimum information gain for a split to be considered at a tree node. Should be at least 0.0. (default = 0.0)
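
To make the threshold concrete, the sketch below computes the information gain of one candidate split and compares it with minInfoGain. It rests on several assumptions: Gini impurity is used as the impurity measure (the text above does not name one), gain is defined as the parent impurity minus the count-weighted child impurities, and a split whose gain equals the threshold is still considered. `InfoGainCheck` is a hypothetical class, not part of Spark.

```java
/** Hypothetical illustration of an information-gain threshold; not part of Spark. */
final class InfoGainCheck {

    private static long sum(long[] counts) {
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        return total;
    }

    /** Gini impurity of a node given its per-class instance counts (assumed measure). */
    static double gini(long[] classCounts) {
        long total = sum(classCounts);
        if (total == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (long c : classCounts) {
            double p = (double) c / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    /** Gain = impurity(parent) - weighted impurity of the two children. */
    static double gain(long[] left, long[] right) {
        long nLeft = sum(left);
        long nRight = sum(right);
        long n = nLeft + nRight;
        if (n == 0) {
            return 0.0;
        }
        long[] parent = new long[left.length];
        for (int i = 0; i < left.length; i++) {
            parent[i] = left[i] + right[i];
        }
        return gini(parent)
                - ((double) nLeft / n) * gini(left)
                - ((double) nRight / n) * gini(right);
    }

    /** Assumption: a split is considered when its gain is at least minInfoGain. */
    static boolean isSplitConsidered(long[] left, long[] right, double minInfoGain) {
        return gain(left, right) >= minInfoGain;
    }

    public static void main(String[] args) {
        long[] left = {5, 0};
        long[] right = {0, 5};
        System.out.println(isSplitConsidered(left, right, 0.1));
    }
}
```

A split that separates the classes perfectly has a large gain, while a split that leaves the class proportions unchanged has a gain near zero and is filtered out by any positive threshold.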
    • minInstancesPerNode

      IntParam minInstancesPerNode()
      Minimum number of instances each child must have after a split. If a split causes the left or right child to have fewer than minInstancesPerNode instances, the split will be discarded as invalid. Must be at least 1. (default = 1)
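
The validity rule above can be sketched as a simple predicate on the two child sizes. `MinInstancesCheck` is a hypothetical helper, not part of Spark, and the child counts are assumed to be known for the candidate split.

```java
/** Hypothetical illustration of the minInstancesPerNode rule; not part of Spark. */
final class MinInstancesCheck {

    /**
     * A split is kept only if both children have at least
     * minInstancesPerNode instances.
     */
    static boolean isSplitValid(long leftCount, long rightCount, int minInstancesPerNode) {
        if (minInstancesPerNode < 1) {
            throw new IllegalArgumentException("minInstancesPerNode must be at least 1");
        }
        return leftCount >= minInstancesPerNode && rightCount >= minInstancesPerNode;
    }

    public static void main(String[] args) {
        System.out.println(isSplitValid(10, 3, 5)); // prints false: right child has only 3
        System.out.println(isSplitValid(10, 5, 5)); // prints true
    }
}
```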
    • minWeightFractionPerNode

      DoubleParam minWeightFractionPerNode()
      Minimum fraction of the weighted sample count that each child must have after a split. If a split causes the fraction of the total weight in the left or right child to be less than minWeightFractionPerNode, the split will be discarded as invalid. Should be in the interval [0.0, 0.5). (default = 0.0)
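
The weighted variant of the rule can be sketched in the same way. `MinWeightFractionCheck` is a hypothetical helper, not part of Spark. The caller passes the total weight explicitly, because which total the fraction is measured against (for example, the whole training set) is an assumption of this sketch rather than something the description above spells out.

```java
/** Hypothetical illustration of the minWeightFractionPerNode rule; not part of Spark. */
final class MinWeightFractionCheck {

    /**
     * A split is kept only if each child holds at least the given fraction
     * of totalWeight. Assumption: totalWeight is supplied by the caller.
     */
    static boolean isSplitValid(double leftWeight, double rightWeight,
                                double totalWeight, double minWeightFractionPerNode) {
        if (minWeightFractionPerNode < 0.0 || minWeightFractionPerNode >= 0.5) {
            throw new IllegalArgumentException("minWeightFractionPerNode must be in [0.0, 0.5)");
        }
        if (totalWeight <= 0.0) {
            return false; // no weight to distribute, so no split can qualify
        }
        return leftWeight / totalWeight >= minWeightFractionPerNode
                && rightWeight / totalWeight >= minWeightFractionPerNode;
    }

    public static void main(String[] args) {
        // Children hold 30% and 20% of the total weight; the threshold is 25%.
        System.out.println(isSplitValid(30.0, 20.0, 100.0, 0.25)); // prints false
    }
}
```

Two disjoint children cannot both hold more than half of the same total, which is consistent with the documented upper bound of 0.5 (exclusive).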
    • setLeafCol

      DecisionTreeParams setLeafCol(String value)