trait Impurity extends Serializable
Trait for calculating information gain. This trait is used for (a) setting the impurity parameter in org.apache.spark.mllib.tree.configuration.Strategy and (b) calculating impurity values from sufficient statistics.
- Annotations
- @Since( "1.0.0" )
- Source
- Impurity.scala
Abstract Value Members
- abstract def calculate(count: Double, sum: Double, sumSquares: Double): Double
Information calculation for regression.
- count
number of instances
- sum
sum of labels
- sumSquares
summation of squares of the labels
- returns
information value, or 0 if count = 0
- Annotations
- @Since( "1.0.0" )
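As an illustration of how an impurity value can be computed from the three sufficient statistics above, here is a minimal sketch of a variance-based calculation. `VarianceSketch` is a hypothetical standalone object written for this example; it does not extend the real trait and is not Spark's source code, although Spark does ship a `Variance` impurity in `org.apache.spark.mllib.tree.impurity`.

```scala
// Hypothetical standalone sketch: variance of the labels computed from
// sufficient statistics, following the parameter contract documented above.
object VarianceSketch {

  /**
   * @param count      number of instances
   * @param sum        sum of labels
   * @param sumSquares sum of squares of the labels
   * @return variance of the labels, or 0 if count = 0
   */
  def calculate(count: Double, sum: Double, sumSquares: Double): Double = {
    if (count == 0) {
      0.0
    } else {
      // Sum of squared deviations from the mean:
      // sum(x^2) - (sum(x))^2 / n
      val squaredLoss = sumSquares - (sum * sum) / count
      squaredLoss / count
    }
  }
}
```

For the labels 1, 2, 3 the statistics are count = 3, sum = 6, sumSquares = 14, so `VarianceSketch.calculate(3, 6, 14)` evaluates to (14 - 36 / 3) / 3 = 2/3. Working from running sums means the labels themselves never need to be revisited when candidate splits are evaluated.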
- abstract def calculate(counts: Array[Double], totalCount: Double): Double
Information calculation for multiclass classification.
- counts
Array[Double] with counts for each label
- totalCount
sum of counts for all labels
- returns
information value, or 0 if totalCount = 0
- Annotations
- @Since( "1.1.0" )
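The classification overload can be illustrated in the same way. The sketch below computes Gini impurity from per-label counts; `GiniSketch` is a hypothetical standalone object for this example, not Spark's implementation, although Spark does provide `Gini` and `Entropy` impurities in `org.apache.spark.mllib.tree.impurity`.

```scala
// Hypothetical standalone sketch: Gini impurity computed from label counts,
// following the parameter contract documented above.
object GiniSketch {

  /**
   * @param counts     counts for each label
   * @param totalCount sum of counts for all labels
   * @return Gini impurity, or 0 if totalCount = 0
   */
  def calculate(counts: Array[Double], totalCount: Double): Double = {
    if (totalCount == 0) {
      0.0
    } else {
      // Gini impurity is 1 minus the sum of squared class frequencies.
      val sumOfSquaredFractions = counts.map { c =>
        val freq = c / totalCount
        freq * freq
      }.sum
      1.0 - sumOfSquaredFractions
    }
  }
}
```

With two equally frequent labels, `GiniSketch.calculate(Array(5.0, 5.0), 10.0)` gives 0.5, and a node containing a single label, such as `Array(10.0, 0.0)` with a total of 10, gives 0.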