Class GradientBoostedTreesModel
Object
org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
- All Implemented Interfaces:
- Serializable, Saveable
Represents a gradient boosted trees model.
 
param: algo algorithm for the ensemble model, either Classification or Regression
param: trees tree ensembles
param: treeWeights tree ensemble weights
Constructor Summary
Constructors
- GradientBoostedTreesModel(scala.Enumeration.Value algo, DecisionTreeModel[] trees, double[] treeWeights)
- 
Method Summary
Modifier and Type, Method: Description
- scala.Enumeration.Value algo()
- static RDD<scala.Tuple2<Object,Object>> computeInitialPredictionAndError(RDD<LabeledPoint> data, double initTreeWeight, DecisionTreeModel initTree, Loss loss): Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
- double[] evaluateEachIteration(RDD<LabeledPoint> data, Loss loss): Method to compute error or loss for every iteration of gradient boosting.
- static GradientBoostedTreesModel load(SparkContext sc, String path)
- static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)
- int numTrees(): Get number of trees in ensemble.
- static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
- static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
- JavaRDD<Double> predict(JavaRDD<Vector> features): Java-friendly version of org.apache.spark.mllib.tree.model.TreeEnsembleModel.predict.
- double predict(Vector features): Predict values for a single data point using the model trained.
- RDD<Object> predict(RDD<Vector> features): Predict values for the given data set.
- void save(SparkContext sc, String path): Save this model to the given path.
- String toDebugString(): Print the full model to a string.
- String toString(): Print a summary of the model.
- int totalNumNodes(): Get total number of nodes, summed over all trees in the ensemble.
- DecisionTreeModel[] trees()
- double[] treeWeights()
- static RDD<scala.Tuple2<Object,Object>> updatePredictionError(RDD<LabeledPoint> data, RDD<scala.Tuple2<Object, Object>> predictionAndError, double treeWeight, DecisionTreeModel tree, Loss loss): Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError)
- 
Constructor Details
GradientBoostedTreesModel
public GradientBoostedTreesModel(scala.Enumeration.Value algo, DecisionTreeModel[] trees, double[] treeWeights)
 
- 
- 
Method Details
computeInitialPredictionAndError
public static RDD<scala.Tuple2<Object,Object>> computeInitialPredictionAndError(RDD<LabeledPoint> data, double initTreeWeight, DecisionTreeModel initTree, Loss loss)
Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
- Parameters:
- data - training data.
- initTreeWeight - learning rate assigned to the first tree.
- initTree - first DecisionTreeModel.
- loss - evaluation metric.
- Returns:
- an RDD with each element being a zip of the prediction and error corresponding to every sample.
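The per-sample computation described above can be pictured without Spark. The following is a minimal sketch in plain Java, not the library's implementation: arrays stand in for the RDD, a ToDoubleFunction stands in for DecisionTreeModel, and the PointLoss interface is a hypothetical stand-in for Loss. It assumes the initial prediction is the first tree's output scaled by initTreeWeight and the error is the loss evaluated on that prediction and the label.

```java
import java.util.function.ToDoubleFunction;

final class InitialPredictionSketch {
    /** Hypothetical stand-in for Loss: error of one prediction against its label. */
    interface PointLoss {
        double error(double prediction, double label);
    }

    /**
     * Returns one {prediction, error} pair per sample.
     * labels[i] and features[i] together play the role of one LabeledPoint.
     */
    static double[][] computeInitial(double[] labels, double[][] features,
                                     double initTreeWeight,
                                     ToDoubleFunction<double[]> initTree,
                                     PointLoss loss) {
        double[][] out = new double[labels.length][];
        for (int i = 0; i < labels.length; i++) {
            // Assumption: the first tree's raw output is scaled by its weight.
            double pred = initTreeWeight * initTree.applyAsDouble(features[i]);
            out[i] = new double[] {pred, loss.error(pred, labels[i])};
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative one-split "tree" and squared-error loss.
        ToDoubleFunction<double[]> stump = x -> x[0] > 0 ? 2.0 : -2.0;
        PointLoss squared = (p, y) -> (y - p) * (y - p);
        double[][] pairs = computeInitial(new double[] {1.0, 3.0},
                new double[][] {{1.0}, {-1.0}}, 0.5, stump, squared);
        for (double[] pair : pairs) {
            System.out.println("prediction=" + pair[0] + " error=" + pair[1]);
        }
    }
}
```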
 
- 
updatePredictionError
public static RDD<scala.Tuple2<Object,Object>> updatePredictionError(RDD<LabeledPoint> data, RDD<scala.Tuple2<Object, Object>> predictionAndError, double treeWeight, DecisionTreeModel tree, Loss loss)
Update a zipped predictionError RDD (as obtained with computeInitialPredictionAndError).
- Parameters:
- data - training data.
- predictionAndError - predictionError RDD.
- treeWeight - learning rate.
- tree - tree used to update the prediction and error.
- loss - evaluation metric.
- Returns:
- an RDD with each element being a zip of the prediction and error corresponding to each sample.
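One boosting step of this update can be sketched in plain Java. This is an illustration under stated assumptions rather than the library's code: arrays replace the RDDs, a ToDoubleFunction replaces DecisionTreeModel, PointLoss is a hypothetical stand-in for Loss, and the new prediction is assumed to be the previous prediction plus treeWeight times the new tree's output.

```java
import java.util.function.ToDoubleFunction;

final class UpdatePredictionSketch {
    /** Hypothetical stand-in for Loss: error of one prediction against its label. */
    interface PointLoss {
        double error(double prediction, double label);
    }

    /**
     * predictionAndError[i] holds {prediction, error} for sample i from the
     * previous iteration; the result holds the same pair after adding one tree.
     */
    static double[][] update(double[] labels, double[][] features,
                             double[][] predictionAndError,
                             double treeWeight,
                             ToDoubleFunction<double[]> tree,
                             PointLoss loss) {
        double[][] out = new double[labels.length][];
        for (int i = 0; i < labels.length; i++) {
            // Assumption: additive update scaled by the learning rate.
            double newPred = predictionAndError[i][0]
                    + treeWeight * tree.applyAsDouble(features[i]);
            // The error is recomputed from scratch; the old error is not reused.
            out[i] = new double[] {newPred, loss.error(newPred, labels[i])};
        }
        return out;
    }

    public static void main(String[] args) {
        PointLoss squared = (p, y) -> (y - p) * (y - p);
        double[][] next = update(new double[] {1.0}, new double[][] {{1.0}},
                new double[][] {{0.5, 0.25}}, 0.1, x -> 5.0, squared);
        System.out.println("prediction=" + next[0][0] + " error=" + next[0][1]);
    }
}
```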
 
- 
load
public static GradientBoostedTreesModel load(SparkContext sc, String path)
- Parameters:
- sc - Spark context used for loading model files.
- path - Path specifying the directory to which the model was saved.
- Returns:
- Model instance
 
- 
algo
public scala.Enumeration.Value algo()
- 
trees
public DecisionTreeModel[] trees()
- 
treeWeights
public double[] treeWeights()
- 
save
public void save(SparkContext sc, String path)
Description copied from interface: Saveable
Save this model to the given path.
This saves:
- human-readable (JSON) model metadata to path/metadata/
- Parquet formatted data to path/data/
The model may be loaded using Loader.load.
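The directory layout described above can be checked with a few lines of plain Java. This is a minimal sketch that assumes the model was saved to a local filesystem path (it does not cover HDFS or other remote stores) and only tests for the two subdirectories; it does not validate their contents. The class name, method name, and default path are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SavedModelLayoutSketch {
    /** True if modelDir contains both metadata/ and data/ subdirectories. */
    static boolean looksLikeSavedModel(Path modelDir) {
        return Files.isDirectory(modelDir.resolve("metadata"))
                && Files.isDirectory(modelDir.resolve("data"));
    }

    public static void main(String[] args) {
        // "myGbtModel" is a placeholder path used only for illustration.
        Path dir = Paths.get(args.length > 0 ? args[0] : "myGbtModel");
        System.out.println(dir + " has metadata/ and data/: " + looksLikeSavedModel(dir));
    }
}
```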
- 
evaluateEachIteration
public double[] evaluateEachIteration(RDD<LabeledPoint> data, Loss loss)
Method to compute error or loss for every iteration of gradient boosting.
- Parameters:
- data - RDD of LabeledPoint
- loss - evaluation metric.
- Returns:
- an array with index i having the losses or errors for the ensemble containing the first i+1 trees
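The shape of the returned array can be illustrated in plain Java. The sketch below is not the Spark implementation: it assumes that entry i is the mean per-sample error of the ensemble made of the first i+1 trees, that predictions accumulate additively with the tree weights, and that labels are used as given. PointLoss is a hypothetical stand-in for Loss, and arrays replace the RDD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class EvaluateEachIterationSketch {
    /** Hypothetical stand-in for Loss: error of one prediction against its label. */
    interface PointLoss {
        double error(double prediction, double label);
    }

    /** Entry t is the mean error after adding trees 0..t; expects a non-empty data set. */
    static double[] evaluate(double[] labels, double[][] features,
                             List<ToDoubleFunction<double[]>> trees,
                             double[] treeWeights, PointLoss loss) {
        double[] running = new double[labels.length]; // cumulative prediction per sample
        double[] lossPerIteration = new double[trees.size()];
        for (int t = 0; t < trees.size(); t++) {
            double total = 0.0;
            for (int i = 0; i < labels.length; i++) {
                running[i] += treeWeights[t] * trees.get(t).applyAsDouble(features[i]);
                total += loss.error(running[i], labels[i]);
            }
            lossPerIteration[t] = total / labels.length;
        }
        return lossPerIteration;
    }

    public static void main(String[] args) {
        List<ToDoubleFunction<double[]>> trees = Arrays.asList(x -> 1.0, x -> 2.0);
        PointLoss squared = (p, y) -> (y - p) * (y - p);
        double[] losses = evaluate(new double[] {2.0, 2.0},
                new double[][] {{0.0}, {0.0}}, trees, new double[] {1.0, 0.5}, squared);
        System.out.println(Arrays.toString(losses));
    }
}
```

A curve like this is typically inspected to see at which iteration the error stops improving on held-out data.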
 
- 
org$apache$spark$internal$Logging$$log_
public static org.slf4j.Logger org$apache$spark$internal$Logging$$log_()
- 
org$apache$spark$internal$Logging$$log__$eq
public static void org$apache$spark$internal$Logging$$log__$eq(org.slf4j.Logger x$1)
- 
LogStringContext
public static org.apache.spark.internal.Logging.LogStringContext LogStringContext(scala.StringContext sc)
- 
predict
public double predict(Vector features)
Predict values for a single data point using the model trained.
- Parameters:
- features - array representing a single data point
- Returns:
- predicted value from the trained model (the predicted category for Classification)
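How trees and treeWeights combine into a single prediction can be sketched in plain Java. This is a hedged illustration, not the library's code: it assumes the ensemble sums each tree's output multiplied by its weight, returns that sum directly for Regression, and for Classification maps a positive sum to 1.0 and anything else to 0.0. The zero threshold and the boolean flag are assumptions of this sketch; a ToDoubleFunction stands in for DecisionTreeModel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class EnsemblePredictSketch {
    /** Weighted sum of the individual tree outputs for one data point. */
    static double sumWeighted(double[] features,
                              List<ToDoubleFunction<double[]>> trees,
                              double[] treeWeights) {
        double sum = 0.0;
        for (int t = 0; t < trees.size(); t++) {
            sum += treeWeights[t] * trees.get(t).applyAsDouble(features);
        }
        return sum;
    }

    /** Regression returns the raw sum; classification thresholds it at zero (assumed). */
    static double predict(boolean classification, double[] features,
                          List<ToDoubleFunction<double[]>> trees,
                          double[] treeWeights) {
        double sum = sumWeighted(features, trees, treeWeights);
        if (!classification) {
            return sum;
        }
        return sum > 0.0 ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        List<ToDoubleFunction<double[]>> trees = Arrays.asList(x -> 1.0, x -> -4.0);
        double[] point = {0.0};
        System.out.println("regression: " + predict(false, point, trees, new double[] {1.0, 0.1}));
        System.out.println("classification: " + predict(true, point, trees, new double[] {1.0, 0.1}));
    }
}
```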
 
- 
predict
public RDD<Object> predict(RDD<Vector> features)
Predict values for the given data set.
- Parameters:
- features - RDD representing data points to be predicted
- Returns:
- RDD[Double] where each entry contains the corresponding prediction
 
- 
predict
public JavaRDD<Double> predict(JavaRDD<Vector> features)
Java-friendly version of org.apache.spark.mllib.tree.model.TreeEnsembleModel.predict.
- Parameters:
- features - (undocumented)
- Returns:
- (undocumented)
 
- 
toString
public String toString()
Print a summary of the model.
- 
toDebugString
public String toDebugString()
Print the full model to a string.
- Returns:
- (undocumented)
 
- 
numTrees
public int numTrees()
Get number of trees in ensemble.
- Returns:
- (undocumented)
 
- 
totalNumNodes
public int totalNumNodes()
Get total number of nodes, summed over all trees in the ensemble.
- Returns:
- (undocumented)
 
 
-