Class Strategy
Object
org.apache.spark.mllib.tree.configuration.Strategy
- All Implemented Interfaces:
- Serializable
Stores all the configuration options for tree construction.
 param:  algo  Learning goal.  Supported:
              org.apache.spark.mllib.tree.configuration.Algo.Classification,
              org.apache.spark.mllib.tree.configuration.Algo.Regression
 param:  impurity Criterion used for information gain calculation.
                 Supported for Classification: Gini, Entropy.
                 Supported for Regression: Variance.
 param:  maxDepth Maximum depth of the tree (e.g. depth 0 means 1 leaf node, depth 1 means
                 1 internal node + 2 leaf nodes).
 param:  numClasses Number of classes for classification.
                                    (Ignored for regression.)
                                    Default value is 2 (binary classification).
 param:  maxBins Maximum number of bins used for discretizing continuous features and
                for choosing how to split on features at each node.
                More bins give higher granularity.
 param:  quantileCalculationStrategy Algorithm for calculating quantiles.  Supported:
                             org.apache.spark.mllib.tree.configuration.QuantileStrategy.Sort
 param:  categoricalFeaturesInfo A map storing information about the categorical variables and the
                                number of discrete values they take. An entry (n to k)
                                indicates that feature n is categorical with k categories
                                indexed from 0: {0, 1, ..., k-1}.
 param:  minInstancesPerNode Minimum number of instances each child must have after a split.
                            Default value is 1. If a split causes the left or right child
                            to have fewer than minInstancesPerNode instances,
                            the split is not considered valid.
 param:  minInfoGain Minimum information gain a split must achieve. Default value is 0.0.
                    If a split has less information gain than minInfoGain,
                    the split is not considered valid.
 param:  maxMemoryInMB Maximum memory in MB allocated to histogram aggregation. Default value is
                      256 MB.  If too small, then 1 node will be split per iteration, and
                      its aggregates may exceed this size.
 param:  subsamplingRate Fraction of the training data used for learning the decision tree.
 param:  useNodeIdCache If this is true, instead of passing trees to executors, the algorithm will
                        maintain a separate RDD that caches the node Id for each row.
 param:  checkpointInterval How often to checkpoint when the node Id cache gets updated.
                           E.g. 10 means that the cache will get checkpointed every 10 updates. If
                           the checkpoint directory is not set in
                           SparkContext, this setting is ignored.
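The minInstancesPerNode and minInfoGain descriptions above amount to one validity check on a candidate split. The following is a minimal Java sketch of that check as the parameter text states it, not Spark's internal implementation; the class and method names (SplitValiditySketch, isValidSplit) are hypothetical.

```java
// Hypothetical helper illustrating the documented thresholds; not part of Spark.
final class SplitValiditySketch {
    private SplitValiditySketch() {}

    /**
     * Returns true when a candidate split passes both documented thresholds:
     * each child keeps at least minInstancesPerNode instances, and the
     * information gain is not less than minInfoGain.
     */
    static boolean isValidSplit(long leftCount, long rightCount, double infoGain,
                                int minInstancesPerNode, double minInfoGain) {
        boolean childrenLargeEnough =
            leftCount >= minInstancesPerNode && rightCount >= minInstancesPerNode;
        boolean gainLargeEnough = infoGain >= minInfoGain;
        return childrenLargeEnough && gainLargeEnough;
    }

    public static void main(String[] args) {
        // Documented defaults: minInstancesPerNode = 1, minInfoGain = 0.0.
        System.out.println(isValidSplit(40, 60, 0.12, 1, 0.0));
        // An empty left child violates minInstancesPerNode = 1.
        System.out.println(isValidSplit(0, 100, 0.12, 1, 0.0));
        // A gain of 0.01 is below a minInfoGain of 0.05.
        System.out.println(isValidSplit(40, 60, 0.01, 1, 0.05));
    }
}
```

Raising either threshold rejects more candidate splits, which generally yields smaller trees.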
Constructor Summary
Constructors
- Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, Map<Integer, Integer> categoricalFeaturesInfo)
  Java-friendly constructor for Strategy
- Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, scala.Enumeration.Value quantileCalculationStrategy, scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo, int minInstancesPerNode, double minInfoGain, int maxMemoryInMB, double subsamplingRate, boolean useNodeIdCache, int checkpointInterval)
  Backwards compatible constructor for Strategy
- Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, scala.Enumeration.Value quantileCalculationStrategy, scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo, int minInstancesPerNode, double minInfoGain, int maxMemoryInMB, double subsamplingRate, boolean useNodeIdCache, int checkpointInterval, double minWeightFractionPerNode, boolean bootstrap)
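Method Summary
- scala.Enumeration.Value algo()
- scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo()
- int checkpointInterval()
- Strategy copy()
  Returns a shallow copy of this instance.
- static Strategy defaultStrategy(String algo)
  Construct a default set of parameters for DecisionTree
- static Strategy defaultStrategy(scala.Enumeration.Value algo)
  Construct a default set of parameters for DecisionTree
- scala.Enumeration.Value getAlgo()
- scala.collection.immutable.Map<Object, Object> getCategoricalFeaturesInfo()
- int getCheckpointInterval()
- Impurity getImpurity()
- int getMaxBins()
- int getMaxDepth()
- int getMaxMemoryInMB()
- double getMinInfoGain()
- int getMinInstancesPerNode()
- double getMinWeightFractionPerNode()
- int getNumClasses()
- scala.Enumeration.Value getQuantileCalculationStrategy()
- double getSubsamplingRate()
- boolean getUseNodeIdCache()
- Impurity impurity()
- boolean isMulticlassClassification()
- boolean isMulticlassWithCategoricalFeatures()
- int maxBins()
- int maxDepth()
- int maxMemoryInMB()
- double minInfoGain()
- int minInstancesPerNode()
- double minWeightFractionPerNode()
- int numClasses()
- scala.Enumeration.Value quantileCalculationStrategy()
- void setAlgo(String algo)
  Sets Algorithm using a String.
- void setAlgo(scala.Enumeration.Value x$1)
- void setCategoricalFeaturesInfo(Map<Integer, Integer> categoricalFeaturesInfo)
  Sets categoricalFeaturesInfo using a Java Map.
- void setCategoricalFeaturesInfo(scala.collection.immutable.Map<Object, Object> x$1)
- void setCheckpointInterval(int x$1)
- void setImpurity(Impurity x$1)
- void setMaxBins(int x$1)
- void setMaxDepth(int x$1)
- void setMaxMemoryInMB(int x$1)
- void setMinInfoGain(double x$1)
- void setMinInstancesPerNode(int x$1)
- void setMinWeightFractionPerNode(double x$1)
- void setNumClasses(int x$1)
- void setQuantileCalculationStrategy(scala.Enumeration.Value x$1)
- void setSubsamplingRate(double x$1)
- void setUseNodeIdCache(boolean x$1)
- double subsamplingRate()
- boolean useNodeIdCache()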
Constructor Details

Strategy
public Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, scala.Enumeration.Value quantileCalculationStrategy, scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo, int minInstancesPerNode, double minInfoGain, int maxMemoryInMB, double subsamplingRate, boolean useNodeIdCache, int checkpointInterval, double minWeightFractionPerNode, boolean bootstrap)

Strategy
public Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, scala.Enumeration.Value quantileCalculationStrategy, scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo, int minInstancesPerNode, double minInfoGain, int maxMemoryInMB, double subsamplingRate, boolean useNodeIdCache, int checkpointInterval)
Backwards compatible constructor for Strategy
Parameters:
- algo - (undocumented)
- impurity - (undocumented)
- maxDepth - (undocumented)
- numClasses - (undocumented)
- maxBins - (undocumented)
- quantileCalculationStrategy - (undocumented)
- categoricalFeaturesInfo - (undocumented)
- minInstancesPerNode - (undocumented)
- minInfoGain - (undocumented)
- maxMemoryInMB - (undocumented)
- subsamplingRate - (undocumented)
- useNodeIdCache - (undocumented)
- checkpointInterval - (undocumented)

Strategy
public Strategy(scala.Enumeration.Value algo, Impurity impurity, int maxDepth, int numClasses, int maxBins, Map<Integer, Integer> categoricalFeaturesInfo)
Java-friendly constructor for Strategy
Parameters:
- algo - (undocumented)
- impurity - (undocumented)
- maxDepth - (undocumented)
- numClasses - (undocumented)
- maxBins - (undocumented)
- categoricalFeaturesInfo - (undocumented)
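The six-argument Java-friendly constructor takes a java.util.Map directly, so no Scala collection conversion is needed on the caller's side. A minimal sketch, assuming spark-mllib is on the classpath; the class name and all chosen values (depth 5, 2 classes, 32 bins, feature 0 with 3 categories) are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.mllib.tree.configuration.Algo;
import org.apache.spark.mllib.tree.configuration.Strategy;
import org.apache.spark.mllib.tree.impurity.Gini;

// Illustrative class name; requires spark-mllib on the classpath.
final class StrategyConstructorExample {
    public static void main(String[] args) {
        // Feature 0 is categorical with 3 categories, indexed 0, 1, 2.
        Map<Integer, Integer> categoricalFeaturesInfo = new HashMap<>();
        categoricalFeaturesInfo.put(0, 3);

        Strategy strategy = new Strategy(
            Algo.Classification(),   // algo
            Gini.instance(),         // impurity
            5,                       // maxDepth
            2,                       // numClasses
            32,                      // maxBins
            categoricalFeaturesInfo);

        System.out.println("maxDepth = " + strategy.getMaxDepth());
        System.out.println("maxBins = " + strategy.getMaxBins());
    }
}
```

Options that this constructor does not expose, such as minInfoGain, can be changed afterwards through the corresponding setters.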
 
 
Method Details

defaultStrategy
public static Strategy defaultStrategy(String algo)
Construct a default set of parameters for DecisionTree
Parameters:
- algo - "Classification" or "Regression"
Returns:
- (undocumented)

defaultStrategy
public static Strategy defaultStrategy(scala.Enumeration.Value algo)
Construct a default set of parameters for DecisionTree
Parameters:
- algo - Algo.Classification or Algo.Regression
Returns:
- (undocumented)
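A common pattern is to start from defaultStrategy and override only the options that matter for the task. The sketch below assumes spark-mllib is on the classpath; the class name and the specific values are illustrative, not recommendations.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.mllib.tree.configuration.Strategy;

// Illustrative class name; requires spark-mllib on the classpath.
final class DefaultStrategyExample {
    public static void main(String[] args) {
        // Start from the defaults for classification, then override selected options.
        Strategy strategy = Strategy.defaultStrategy("Classification");
        strategy.setNumClasses(3);
        strategy.setMaxDepth(4);
        strategy.setMinInstancesPerNode(2);

        // Feature 2 is categorical with 4 categories, indexed 0 to 3.
        Map<Integer, Integer> categoricalFeaturesInfo = new HashMap<>();
        categoricalFeaturesInfo.put(2, 4);
        strategy.setCategoricalFeaturesInfo(categoricalFeaturesInfo);

        System.out.println("numClasses = " + strategy.getNumClasses());
        System.out.println("maxDepth = " + strategy.getMaxDepth());
    }
}
```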
 
algo
public scala.Enumeration.Value algo()

impurity
public Impurity impurity()

maxDepth
public int maxDepth()
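The depth convention given for maxDepth (depth 0 means 1 leaf node, depth 1 means 1 internal node plus 2 leaf nodes) extends to an upper bound on tree size, since each split produces a left and a right child. The sketch below computes that bound in plain Java; TreeDepthSketch and its methods are hypothetical names, and the numbers are limits for a full binary tree, not the size of any trained model.

```java
// Hypothetical helper for the depth convention described for maxDepth; not part of Spark.
final class TreeDepthSketch {
    private TreeDepthSketch() {}

    /** Largest possible number of leaf nodes in a binary tree of the given depth: 2^depth. */
    static long maxLeafNodes(int depth) {
        checkDepth(depth);
        return 1L << depth;
    }

    /** Largest possible number of internal (splitting) nodes: 2^depth - 1. */
    static long maxInternalNodes(int depth) {
        return maxLeafNodes(depth) - 1;
    }

    /** Largest possible number of nodes overall: 2^(depth + 1) - 1. */
    static long maxTotalNodes(int depth) {
        return maxLeafNodes(depth) + maxInternalNodes(depth);
    }

    private static void checkDepth(int depth) {
        // Depths above 61 would overflow the long arithmetic used in this sketch.
        if (depth < 0 || depth > 61) {
            throw new IllegalArgumentException("depth must be in [0, 61], got " + depth);
        }
    }

    public static void main(String[] args) {
        for (int depth = 0; depth <= 3; depth++) {
            System.out.println("depth " + depth + ": at most "
                + maxInternalNodes(depth) + " internal + "
                + maxLeafNodes(depth) + " leaf nodes");
        }
    }
}
```

Because the bound doubles with each extra level, small increases in maxDepth can greatly increase the size of the tree that may be built.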

numClasses
public int numClasses()

maxBins
public int maxBins()

quantileCalculationStrategy
public scala.Enumeration.Value quantileCalculationStrategy()

categoricalFeaturesInfo
public scala.collection.immutable.Map<Object, Object> categoricalFeaturesInfo()
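An entry (n to k) in categoricalFeaturesInfo declares that feature n takes values in {0, 1, ..., k-1}. The following plain-Java sketch checks a feature value against such a map before training; it is a hypothetical pre-validation helper (CategoricalFeaturesSketch), not something Spark provides, and it assumes feature values arrive as doubles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical validation helper; not part of Spark.
final class CategoricalFeaturesSketch {
    private CategoricalFeaturesSketch() {}

    /**
     * Returns true when value is a whole number in [0, k), where k is the
     * category count declared for featureIndex in categoricalFeaturesInfo.
     */
    static boolean isValidCategory(Map<Integer, Integer> categoricalFeaturesInfo,
                                   int featureIndex, double value) {
        Integer numCategories = categoricalFeaturesInfo.get(featureIndex);
        if (numCategories == null) {
            throw new IllegalArgumentException(
                "feature " + featureIndex + " is not declared as categorical");
        }
        boolean isWholeNumber = value == Math.rint(value);
        return isWholeNumber && value >= 0 && value < numCategories;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> info = new HashMap<>();
        info.put(0, 3); // feature 0 has categories 0, 1, 2

        System.out.println(isValidCategory(info, 0, 2.0)); // within {0, 1, 2}
        System.out.println(isValidCategory(info, 0, 3.0)); // index k itself is out of range
        System.out.println(isValidCategory(info, 0, 1.5)); // not a category index
    }
}
```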

minInstancesPerNode
public int minInstancesPerNode()

minInfoGain
public double minInfoGain()

maxMemoryInMB
public int maxMemoryInMB()

subsamplingRate
public double subsamplingRate()

useNodeIdCache
public boolean useNodeIdCache()

checkpointInterval
public int checkpointInterval()
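The checkpointInterval description (a value of 10 means the node Id cache is checkpointed every 10 updates, and the setting is ignored when no checkpoint directory is set) can be read as a simple schedule. The plain-Java sketch below models that reading; CheckpointScheduleSketch is a hypothetical name, and counting updates from 1 is an assumption of this sketch rather than a documented detail.

```java
// Hypothetical model of the documented checkpoint schedule; not part of Spark.
final class CheckpointScheduleSketch {
    private CheckpointScheduleSketch() {}

    /**
     * Returns true when the cache would be checkpointed after the given update.
     *
     * @param updateCount        number of cache updates so far, counted from 1 (assumption)
     * @param checkpointInterval how many updates lie between checkpoints
     * @param checkpointDirSet   whether a checkpoint directory is configured
     */
    static boolean shouldCheckpoint(int updateCount, int checkpointInterval,
                                    boolean checkpointDirSet) {
        if (updateCount < 1 || checkpointInterval < 1) {
            throw new IllegalArgumentException(
                "updateCount and checkpointInterval must be positive");
        }
        // Without a checkpoint directory the interval has no effect.
        return checkpointDirSet && updateCount % checkpointInterval == 0;
    }

    public static void main(String[] args) {
        for (int update = 1; update <= 30; update++) {
            if (shouldCheckpoint(update, 10, true)) {
                System.out.println("checkpoint after update " + update);
            }
        }
    }
}
```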

minWeightFractionPerNode
public double minWeightFractionPerNode()

isMulticlassClassification
public boolean isMulticlassClassification()
Returns:
- (undocumented)

isMulticlassWithCategoricalFeatures
public boolean isMulticlassWithCategoricalFeatures()
Returns:
- (undocumented)

setAlgo
public void setAlgo(String algo)
Sets Algorithm using a String.
Parameters:
- algo - (undocumented)

setCategoricalFeaturesInfo
public void setCategoricalFeaturesInfo(Map<Integer, Integer> categoricalFeaturesInfo)
Sets categoricalFeaturesInfo using a Java Map.
Parameters:
- categoricalFeaturesInfo - (undocumented)

copy
public Strategy copy()
Returns a shallow copy of this instance.
Returns:
- (undocumented)

getAlgo
public scala.Enumeration.Value getAlgo()

getCategoricalFeaturesInfo
public scala.collection.immutable.Map<Object, Object> getCategoricalFeaturesInfo()

getCheckpointInterval
public int getCheckpointInterval()

getImpurity
public Impurity getImpurity()

getMaxBins
public int getMaxBins()

getMaxDepth
public int getMaxDepth()

getMaxMemoryInMB
public int getMaxMemoryInMB()

getMinInfoGain
public double getMinInfoGain()

getMinInstancesPerNode
public int getMinInstancesPerNode()

getMinWeightFractionPerNode
public double getMinWeightFractionPerNode()

getNumClasses
public int getNumClasses()

getQuantileCalculationStrategy
public scala.Enumeration.Value getQuantileCalculationStrategy()

getSubsamplingRate
public double getSubsamplingRate()

getUseNodeIdCache
public boolean getUseNodeIdCache()

setAlgo
public void setAlgo(scala.Enumeration.Value x$1)

setCategoricalFeaturesInfo
public void setCategoricalFeaturesInfo(scala.collection.immutable.Map<Object, Object> x$1)

setCheckpointInterval
public void setCheckpointInterval(int x$1)

setImpurity
public void setImpurity(Impurity x$1)

setMaxBins
public void setMaxBins(int x$1)

setMaxDepth
public void setMaxDepth(int x$1)

setMaxMemoryInMB
public void setMaxMemoryInMB(int x$1)

setMinInfoGain
public void setMinInfoGain(double x$1)

setMinInstancesPerNode
public void setMinInstancesPerNode(int x$1)

setMinWeightFractionPerNode
public void setMinWeightFractionPerNode(double x$1)

setNumClasses
public void setNumClasses(int x$1)

setQuantileCalculationStrategy
public void setQuantileCalculationStrategy(scala.Enumeration.Value x$1)

setSubsamplingRate
public void setSubsamplingRate(double x$1)

setUseNodeIdCache
public void setUseNodeIdCache(boolean x$1)