Class

org.apache.spark.mllib.evaluation

BinaryClassificationMetrics

class BinaryClassificationMetrics extends Logging

Evaluator for binary classification.

Annotations
@Since( "1.0.0" )
Source
BinaryClassificationMetrics.scala
Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new BinaryClassificationMetrics(scoreAndLabels: RDD[(Double, Double)])

    Defaults numBins to 0.

    Annotations
    @Since( "1.0.0" )
  2. new BinaryClassificationMetrics(scoreAndLabels: RDD[(Double, Double)], numBins: Int)

    scoreAndLabels

    an RDD of (score, label) pairs.

    numBins

    If greater than 0, the curves (ROC curve, PR curve) computed internally are down-sampled to this many "bins". If 0, no down-sampling occurs. Down-sampling is useful because each curve contains a point for every distinct score in the input, and this could be as large as the input itself: millions of points or more, when thousands may be entirely sufficient to summarize the curve. After down-sampling, the curves are made of approximately numBins points. Points are made from bins of equal numbers of consecutive points. The size of each bin is floor(scoreAndLabels.count() / numBins), which means the resulting number of bins may not exactly equal numBins. The last bin in each partition may be smaller as a result, meaning there may be an extra sample at partition boundaries.

    Annotations
    @Since( "1.3.0" )
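
The bin-size arithmetic described for numBins can be sketched in plain Scala, without Spark. This is only an illustration of the formula floor(count / numBins) stated above; the object DownSampling and its methods binSize and resultingBins are hypothetical names, and the sketch assumes all points sit in a single partition, so it ignores the extra samples that may appear at partition boundaries.

```scala
// Hypothetical helper, not part of Spark: illustrates the documented bin-size formula.
object DownSampling {
  // Bin size as documented: floor(count / numBins).
  // Integer division on non-negative Longs already rounds down.
  def binSize(count: Long, numBins: Int): Long = {
    require(count >= 0, "count must be non-negative")
    require(numBins > 0, "numBins must be positive for down-sampling")
    count / numBins
  }

  // Number of bins produced when `count` consecutive points are grouped
  // into bins of `binSize` points (single-partition assumption).
  def resultingBins(count: Long, numBins: Int): Long = {
    // Assumption: never use a bin smaller than one point.
    val size = math.max(1L, binSize(count, numBins))
    (count + size - 1) / size // ceiling division
  }
}
```

Under these assumptions, 1000 points with numBins = 300 give a bin size of 3 and therefore 334 bins rather than 300, which is why the resulting number of bins may not exactly equal numBins.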

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def areaUnderPR(): Double

    Computes the area under the precision-recall curve.

    Annotations
    @Since( "1.0.0" )
  5. def areaUnderROC(): Double

    Computes the area under the receiver operating characteristic (ROC) curve.

    Annotations
    @Since( "1.0.0" )
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def fMeasureByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, F-Measure) curve with beta = 1.0.

    Annotations
    @Since( "1.0.0" )
  11. def fMeasureByThreshold(beta: Double): RDD[(Double, Double)]

    Returns the (threshold, F-Measure) curve.

    beta

    the beta factor in F-Measure computation.

    returns

    an RDD of (threshold, F-Measure) pairs.

    Annotations
    @Since( "1.0.0" )
    See also

    F1 score (Wikipedia)

  12. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  15. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  18. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  19. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  20. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  21. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  22. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  23. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  26. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  27. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  28. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  30. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  31. final def notify(): Unit

    Definition Classes
    AnyRef
  32. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  33. val numBins: Int

    If greater than 0, the curves (ROC curve, PR curve) computed internally are down-sampled to this many "bins". If 0, no down-sampling occurs. Down-sampling is useful because each curve contains a point for every distinct score in the input, and this could be as large as the input itself: millions of points or more, when thousands may be entirely sufficient to summarize the curve. After down-sampling, the curves are made of approximately numBins points. Points are made from bins of equal numbers of consecutive points. The size of each bin is floor(scoreAndLabels.count() / numBins), which means the resulting number of bins may not exactly equal numBins. The last bin in each partition may be smaller as a result, meaning there may be an extra sample at partition boundaries.

    Annotations
    @Since( "1.3.0" )
  34. def pr(): RDD[(Double, Double)]

    Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, 1.0) prepended to it.

    Annotations
    @Since( "1.0.0" )
    See also

    Precision and recall (Wikipedia)

  35. def precisionByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, precision) curve.

    Annotations
    @Since( "1.0.0" )
  36. def recallByThreshold(): RDD[(Double, Double)]

    Returns the (threshold, recall) curve.

    Annotations
    @Since( "1.0.0" )
  37. def roc(): RDD[(Double, Double)]

    Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.

    Annotations
    @Since( "1.0.0" )
    See also

    Receiver operating characteristic (Wikipedia)

  38. val scoreAndLabels: RDD[(Double, Double)]

    an RDD of (score, label) pairs.

    Annotations
    @Since( "1.3.0" )
  39. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  40. def thresholds(): RDD[Double]

    Returns thresholds in descending order.

    Annotations
    @Since( "1.0.0" )
  41. def toString(): String

    Definition Classes
    AnyRef → Any
  42. def unpersist(): Unit

    Unpersist intermediate RDDs used in the computation.

    Annotations
    @Since( "1.0.0" )
  43. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  44. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
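
To make the shape of the values returned by roc(), areaUnderROC() and fMeasureByThreshold(beta) more concrete, the following plain-Scala sketch recomputes them on a small local Seq instead of an RDD. It is an illustration, not Spark's implementation: the object LocalBinaryMetrics and its methods are hypothetical, and it assumes that a label greater than 0.5 counts as positive, that a point is predicted positive when its score is at least the threshold, that the area is computed with the trapezoidal rule, and that the input contains at least one positive and one negative label.

```scala
// Hypothetical local re-computation, not part of Spark.
object LocalBinaryMetrics {
  // F-measure for a given precision, recall and beta; 0.0 when both are zero.
  def fMeasure(precision: Double, recall: Double, beta: Double = 1.0): Double = {
    val beta2 = beta * beta
    if (precision + recall == 0.0) 0.0
    else (1.0 + beta2) * precision * recall / (beta2 * precision + recall)
  }

  // (false positive rate, true positive rate) points, one per distinct score,
  // with thresholds taken in descending order, (0.0, 0.0) prepended and
  // (1.0, 1.0) appended.
  def roc(scoreAndLabels: Seq[(Double, Double)]): Seq[(Double, Double)] = {
    val positives = scoreAndLabels.count(_._2 > 0.5).toDouble
    val negatives = scoreAndLabels.size - positives
    val thresholds = scoreAndLabels.map(_._1).distinct.sorted.reverse
    val points = thresholds.map { t =>
      val predictedPositive = scoreAndLabels.filter(_._1 >= t)
      val tp = predictedPositive.count(_._2 > 0.5).toDouble
      val fp = predictedPositive.size - tp
      (fp / negatives, tp / positives)
    }
    (0.0, 0.0) +: points :+ (1.0, 1.0)
  }

  // Trapezoidal area under a curve given as (x, y) points ordered by x.
  def areaUnderCurve(curve: Seq[(Double, Double)]): Double =
    curve.sliding(2).collect {
      case Seq((x1, y1), (x2, y2)) => (x2 - x1) * (y1 + y2) / 2.0
    }.sum
}
```

For example, LocalBinaryMetrics.areaUnderCurve(LocalBinaryMetrics.roc(pairs)) plays the role of areaUnderROC() for a local Seq named pairs. Scores that perfectly separate the two classes give an area of 1.0 under these assumptions.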
