KernelDensity

class pyspark.mllib.stat.KernelDensity

Estimate probability density at required points given an RDD of samples from the population.

Examples

>>> kd = KernelDensity()
>>> sample = sc.parallelize([0.0, 1.0])
>>> kd.setSample(sample)
>>> kd.setBandwidth(3.0)
>>> kd.estimate([0.0, 1.0])
array([ 0.12938758,  0.12938758])

Methods

estimate(points)

Estimate the probability density at the given points.

setBandwidth(bandwidth)

Set the bandwidth of the kernel applied to each sample.

setSample(sample)

Set sample points from the population.

Methods Documentation

estimate(points)

Estimate the probability density at the given points. Returns a NumPy array with one density value per point.
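For reference, the quantity being estimated can be sketched as follows, assuming a Gaussian kernel (the kernel that Spark's Scala KernelDensity documentation describes). The symbols are notation introduced here, not names from the API: x is an evaluation point, x_1, ..., x_n are the sample values, and h is the bandwidth.

```latex
\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h\sqrt{2\pi}}
               \exp\!\left( -\frac{(x - x_i)^2}{2h^2} \right)
```

Each sample contributes a normal density centred on itself with standard deviation h, and the estimate is the average of those contributions.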

setBandwidth(bandwidth)

Set the bandwidth of the kernel applied to each sample. Defaults to 1.0.
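To illustrate how the bandwidth affects the result, here is a minimal local sketch in NumPy. It assumes a Gaussian kernel whose standard deviation equals the bandwidth; `gaussian_kde` is a hypothetical helper written for this illustration, not part of pyspark, and it runs on plain Python lists rather than an RDD.

```python
import numpy as np


def gaussian_kde(sample, points, bandwidth=1.0):
    """Average of Gaussian kernels (std = bandwidth) centred on each sample.

    Hypothetical local helper for illustration; not the Spark implementation.
    """
    sample = np.asarray(sample, dtype=float)
    points = np.asarray(points, dtype=float)
    # Standardized distance between every point and every sample value;
    # the result has shape (len(points), len(sample)).
    z = (points[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # One density per evaluation point: the mean over all samples.
    return kernels.mean(axis=1)


narrow = gaussian_kde([0.0, 1.0], [0.0, 1.0], bandwidth=1.0)
wide = gaussian_kde([0.0, 1.0], [0.0, 1.0], bandwidth=3.0)
print(narrow)  # approx. [0.3204565, 0.3204565]
print(wide)    # approx. [0.12938758, 0.12938758]
```

A larger bandwidth spreads each sample's contribution over a wider range, so the estimate becomes flatter and lower near the samples. With bandwidth=3.0 the sketch reproduces the values printed in the Examples section.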

setSample(sample)

Set sample points from the population. Should be an RDD.