Class KernelDensity

Object
org.apache.spark.mllib.stat.KernelDensity
All Implemented Interfaces:
Serializable

public class KernelDensity extends Object implements Serializable
Kernel density estimation. Given a sample from a population, estimates its probability density function at each of the given evaluation points using kernels. Only the Gaussian kernel is supported.

Scala example:


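 import org.apache.spark.mllib.stat.KernelDensity
 // sc is an existing SparkContext
 val sample = sc.parallelize(Seq(0.0, 1.0, 4.0, 4.0))
 // Gaussian kernel with a bandwidth (standard deviation) of 3.0
 val kd = new KernelDensity()
   .setSample(sample)
   .setBandwidth(3.0)
 // One density estimate per evaluation point: -1.0, 2.0 and 5.0
 val densities = kd.estimate(Array(-1.0, 2.0, 5.0))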
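
As a rough illustration of what such an estimate computes, here is a minimal plain-Java sketch that does not use Spark. It assumes the standard Gaussian kernel density estimator, in which the density at a point is the average of normal PDFs centered on each sample value, with the bandwidth as the standard deviation. The class `GaussianKdeSketch` and its methods are hypothetical names for this sketch and are not part of the Spark API.

```java
public class GaussianKdeSketch {
    // PDF of a normal distribution with the given mean and standard deviation, evaluated at x.
    public static double gaussianPdf(double mean, double sd, double x) {
        double z = (x - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    // For each evaluation point, average the Gaussian kernels centered on the sample values.
    public static double[] estimate(double[] sample, double bandwidth, double[] points) {
        double[] densities = new double[points.length];
        for (int i = 0; i < points.length; i++) {
            double sum = 0.0;
            for (double s : sample) {
                sum += gaussianPdf(s, bandwidth, points[i]);
            }
            densities[i] = sum / sample.length;
        }
        return densities;
    }

    public static void main(String[] args) {
        // Same sample, bandwidth and evaluation points as the Scala example.
        double[] sample = {0.0, 1.0, 4.0, 4.0};
        double[] densities = estimate(sample, 3.0, new double[] {-1.0, 2.0, 5.0});
        for (double d : densities) {
            System.out.println(d);
        }
    }
}
```

A larger bandwidth spreads each kernel more widely and gives a smoother estimate; a smaller one follows the individual sample values more closely.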
 
  • Constructor Details

    • KernelDensity

      public KernelDensity()
  • Method Details

    • normPdf

      public static double normPdf(double mean, double standardDeviation, double logStandardDeviationPlusHalfLog2Pi, double x)
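      Evaluates the PDF of a normal distribution with the given mean and standard deviation at x. The logStandardDeviationPlusHalfLog2Pi argument is the precomputed value log(standardDeviation) + 0.5 * log(2 * pi).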
    • setBandwidth

      public KernelDensity setBandwidth(double bandwidth)
      Sets the bandwidth (standard deviation) of the Gaussian kernel (default: 1.0).
      Parameters:
      bandwidth - standard deviation of the Gaussian kernel; must be positive
      Returns:
      this KernelDensity instance, to allow method chaining
    • setSample

      public KernelDensity setSample(RDD<Object> sample)
      Sets the sample to use for density estimation.
      Parameters:
      sample - RDD of sample values (an RDD[Double] in Scala)
      Returns:
      this KernelDensity instance, to allow method chaining
    • setSample

      public KernelDensity setSample(JavaRDD<Double> sample)
      Sets the sample to use for density estimation (for Java users).
      Parameters:
      sample - JavaRDD of sample values
      Returns:
      this KernelDensity instance, to allow method chaining
    • estimate

      public double[] estimate(double[] points)
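      Estimates the probability density function at the given array of points.
      Parameters:
      points - the points at which to evaluate the density
      Returns:
      an array of density estimates, one per evaluation point, in the same order as points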
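
The parameter list of normPdf suggests that the normal PDF is evaluated in log space, with the normalizing term computed once and reused for every sample value. The following plain-Java sketch shows that pattern under this assumption; it is not the Spark implementation. The class `NormPdfSketch` and the helper `logNormalizer` are hypothetical names introduced for this sketch.

```java
public class NormPdfSketch {
    // Precomputes log(sd) + 0.5 * log(2 * pi), the log of the normalizing constant.
    public static double logNormalizer(double standardDeviation) {
        return Math.log(standardDeviation) + 0.5 * Math.log(2.0 * Math.PI);
    }

    // Evaluates the normal PDF at x: computes the log density, then exponentiates.
    public static double normPdf(double mean, double standardDeviation,
                                 double logStandardDeviationPlusHalfLog2Pi, double x) {
        double z = (x - mean) / standardDeviation;
        double logDensity = -0.5 * z * z - logStandardDeviationPlusHalfLog2Pi;
        return Math.exp(logDensity);
    }

    public static void main(String[] args) {
        double bandwidth = 3.0;
        // Computed once, then reused for every (sample value, point) pair.
        double logNorm = logNormalizer(bandwidth);
        System.out.println(normPdf(0.0, bandwidth, logNorm, 2.0));
        System.out.println(normPdf(4.0, bandwidth, logNorm, 2.0));
    }
}
```

Passing the precomputed term avoids recomputing a logarithm for every sample value when the bandwidth is fixed.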