Interface Loss

All Superinterfaces:
Serializable
All Known Subinterfaces:
ClassificationLoss

public interface Loss extends Serializable
Trait for adding "pluggable" loss functions for the gradient boosting algorithm.
  • Method Summary

    Modifier and Type
    Method
    Description
    double
    computeError(double prediction, double label)
    Method to calculate loss when the predictions are already known.
    double
    computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel model, RDD<LabeledPoint> data)
    Method to calculate error of the base learner for the gradient boosting calculation.
    double
    gradient(double prediction, double label)
    Method to calculate the gradients for the gradient boosting calculation.
  • Method Details

    • computeError

      double computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel model, RDD<LabeledPoint> data)
      Method to calculate error of the base learner for the gradient boosting calculation.

      Parameters:
      model - Model of the weak learner.
      data - Training dataset: RDD of LabeledPoint.
      Returns:
      Measure of model error on data.

      Note:
      This method is not used by the gradient boosting algorithm but is useful for debugging purposes.
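The relationship between this dataset-level overload and the pointwise overload can be sketched in plain Java. This is a minimal illustration, not Spark's implementation. It assumes the dataset-level error is the mean of the pointwise errors. The class `MeanLossSketch`, the `PointwiseLoss` interface, and the array-based stand-ins for `RDD<LabeledPoint>` and the model are all hypothetical.

```java
import java.util.function.ToDoubleFunction;

class MeanLossSketch {
    // Hypothetical stand-in for the pointwise overload computeError(prediction, label).
    interface PointwiseLoss {
        double computeError(double prediction, double label);
    }

    // Averages the pointwise loss over a dataset held in plain arrays.
    // `model` maps a feature vector to a prediction (assumed stand-in for the ensemble model).
    static double computeError(ToDoubleFunction<double[]> model,
                               double[][] features,
                               double[] labels,
                               PointwiseLoss loss) {
        if (labels.length == 0 || features.length != labels.length) {
            throw new IllegalArgumentException("features and labels must be non-empty and equal in length");
        }
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double prediction = model.applyAsDouble(features[i]);
            total += loss.computeError(prediction, labels[i]);
        }
        return total / labels.length;
    }
}
```

Because the result depends only on the model's predictions and the labels, a helper of this kind can be called on a held-out set to check a fitted ensemble while debugging.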
    • computeError

      double computeError(double prediction, double label)
      Method to calculate loss when the predictions are already known.

      Parameters:
      prediction - Predicted label.
      label - True label.
      Returns:
      Measure of model error on datapoint.

      Note:
      This method is used in the method evaluateEachIteration to avoid recomputing the predicted values from previously fit trees.
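The idea behind the note above can be illustrated with a small plain-Java sketch: keep a running prediction per data point, add each tree's weighted contribution once, and evaluate the pointwise loss on the running values. This is an assumption-laden illustration, not the code of evaluateEachIteration. The class `IncrementalEvalSketch`, its array layout, and the choice of squared error as the loss are hypothetical.

```java
class IncrementalEvalSketch {
    // Example pointwise loss (assumed here): squared error.
    static double squaredError(double prediction, double label) {
        double err = label - prediction;
        return err * err;
    }

    // treePredictions[m][i] is the prediction of tree m for data point i;
    // weights[m] is the weight of tree m in the ensemble.
    // Returns the mean loss after each boosting iteration.
    static double[] errorPerIteration(double[][] treePredictions,
                                      double[] weights,
                                      double[] labels) {
        int n = labels.length;
        double[] running = new double[n];            // cached ensemble predictions
        double[] errors = new double[treePredictions.length];
        for (int m = 0; m < treePredictions.length; m++) {
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                // Only the newest tree is added; earlier trees are not re-evaluated.
                running[i] += weights[m] * treePredictions[m][i];
                total += squaredError(running[i], labels[i]);
            }
            errors[m] = total / n;
        }
        return errors;
    }
}
```

With the predictions cached this way, evaluating M iterations costs one pass per tree rather than re-scoring all earlier trees at every iteration.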
    • gradient

      double gradient(double prediction, double label)
      Method to calculate the gradients for the gradient boosting calculation.

      Parameters:
      prediction - Predicted label.
      label - True label.
      Returns:
      Loss gradient.
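As a concrete instance of the two pointwise methods, the following sketch uses squared error, where the loss is (label - prediction)^2 and its derivative with respect to the prediction is -2 (label - prediction). It is written as a standalone class with static methods so that it runs without Spark on the classpath. The class name `SquaredErrorSketch` is hypothetical, and the class does not implement the `Loss` interface itself.

```java
class SquaredErrorSketch {
    // Pointwise loss: (label - prediction)^2.
    static double computeError(double prediction, double label) {
        double err = label - prediction;
        return err * err;
    }

    // Derivative of the pointwise loss with respect to the prediction:
    // d/dF (y - F)^2 = -2 (y - F).
    static double gradient(double prediction, double label) {
        return -2.0 * (label - prediction);
    }
}
```

For example, with a prediction of 3.0 and a label of 1.0, the loss is 4.0 and the gradient is 4.0. The positive sign indicates that lowering the prediction reduces the loss.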