public class JavaDoubleRDD
extends Object
Constructor and Description 

JavaDoubleRDD(RDD<Object> srdd) 
Modifier and Type  Method and Description 

JavaDoubleRDD 
cache()
Persist this RDD with the default storage level (`MEMORY_ONLY`).

scala.reflect.ClassTag<Double> 
classTag() 
JavaDoubleRDD 
coalesce(int numPartitions)
Return a new RDD that is reduced into numPartitions partitions.
JavaDoubleRDD 
coalesce(int numPartitions,
boolean shuffle)
Return a new RDD that is reduced into numPartitions partitions.
JavaDoubleRDD 
distinct()
Return a new RDD containing the distinct elements in this RDD.

JavaDoubleRDD 
distinct(int numPartitions)
Return a new RDD containing the distinct elements in this RDD.

JavaDoubleRDD 
filter(Function<Double,Boolean> f)
Return a new RDD containing only the elements that satisfy a predicate.

Double 
first()
Return the first element in this RDD.

static JavaDoubleRDD 
fromRDD(RDD<Object> rdd) 
long[] 
histogram(double[] buckets)
Compute a histogram using the provided buckets.

long[] 
histogram(Double[] buckets,
boolean evenBuckets) 
scala.Tuple2<double[],long[]> 
histogram(int bucketCount)
Compute a histogram of the data using bucketCount number of buckets evenly
spaced between the minimum and maximum of the RDD.

JavaDoubleRDD 
intersection(JavaDoubleRDD other)
Return the intersection of this RDD and another one.

Double 
max()
Return the maximum element of this RDD, as defined by the natural ordering of its elements (the default comparator).

Double 
mean()
Compute the mean of this RDD's elements.

PartialResult<BoundedDouble> 
meanApprox(long timeout)
:: Experimental ::
Approximate operation to return the mean within a timeout.

PartialResult<BoundedDouble> 
meanApprox(long timeout,
Double confidence)
Return the approximate mean of the elements in this RDD.

Double 
min()
Return the minimum element of this RDD, as defined by the natural ordering of its elements (the default comparator).

JavaDoubleRDD 
persist(StorageLevel newLevel)
Set this RDD's storage level to persist its values across operations after the first time
it is computed.

RDD<Double> 
rdd() 
JavaDoubleRDD 
repartition(int numPartitions)
Return a new RDD that has exactly numPartitions partitions.

JavaDoubleRDD 
sample(boolean withReplacement,
Double fraction)
Return a sampled subset of this RDD.

JavaDoubleRDD 
sample(boolean withReplacement,
Double fraction,
long seed)
Return a sampled subset of this RDD.

Double 
sampleStdev()
Compute the sample standard deviation of this RDD's elements (which corrects for bias in estimating the standard deviation by dividing by N-1 instead of N).

Double 
sampleVariance()
Compute the sample variance of this RDD's elements (which corrects for bias in estimating the variance by dividing by N-1 instead of N).

JavaDoubleRDD 
setName(String name)
Assign a name to this RDD.

RDD<Object> 
srdd() 
StatCounter 
stats()
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
Double 
stdev()
Compute the standard deviation of this RDD's elements.

JavaDoubleRDD 
subtract(JavaDoubleRDD other)
Return an RDD with the elements from this that are not in other.
JavaDoubleRDD 
subtract(JavaDoubleRDD other,
int numPartitions)
Return an RDD with the elements from this that are not in other.
JavaDoubleRDD 
subtract(JavaDoubleRDD other,
Partitioner p)
Return an RDD with the elements from this that are not in other.
Double 
sum()
Add up the elements in this RDD.

PartialResult<BoundedDouble> 
sumApprox(long timeout)
:: Experimental ::
Approximate operation to return the sum within a timeout.

PartialResult<BoundedDouble> 
sumApprox(long timeout,
Double confidence)
:: Experimental ::
Approximate operation to return the sum within a timeout.

static RDD<Object> 
toRDD(JavaDoubleRDD rdd) 
JavaDoubleRDD 
union(JavaDoubleRDD other)
Return the union of this RDD and another one.

JavaDoubleRDD 
unpersist()
Mark the RDD as nonpersistent, and remove all blocks for it from memory and disk.

JavaDoubleRDD 
unpersist(boolean blocking)
Mark the RDD as nonpersistent, and remove all blocks for it from memory and disk.

Double 
variance()
Compute the variance of this RDD's elements.

JavaDoubleRDD 
wrapRDD(RDD<Double> rdd) 
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface JavaRDDLike
aggregate, cartesian, checkpoint, collect, collectAsync, collectPartitions, context, count, countApprox, countApprox, countApproxDistinct, countAsync, countByValue, countByValueApprox, countByValueApprox, flatMap, flatMapToDouble, flatMapToPair, fold, foreach, foreachAsync, foreachPartition, foreachPartitionAsync, getCheckpointFile, getStorageLevel, glom, groupBy, groupBy, id, isCheckpointed, isEmpty, iterator, keyBy, map, mapPartitions, mapPartitions, mapPartitionsToDouble, mapPartitionsToDouble, mapPartitionsToPair, mapPartitionsToPair, mapPartitionsWithIndex, mapToDouble, mapToPair, max, min, name, partitions, pipe, pipe, pipe, reduce, saveAsObjectFile, saveAsTextFile, saveAsTextFile, splits, take, takeAsync, takeOrdered, takeOrdered, takeSample, takeSample, toArray, toDebugString, toLocalIterator, top, top, treeAggregate, treeAggregate, treeReduce, treeReduce, zip, zipPartitions, zipWithIndex, zipWithUniqueId
public JavaDoubleRDD(RDD<Object> srdd)
public static JavaDoubleRDD fromRDD(RDD<Object> rdd)
public static RDD<Object> toRDD(JavaDoubleRDD rdd)
public RDD<Object> srdd()
public scala.reflect.ClassTag<Double> classTag()
public RDD<Double> rdd()
public JavaDoubleRDD wrapRDD(RDD<Double> rdd)
public JavaDoubleRDD cache()
public JavaDoubleRDD persist(StorageLevel newLevel)
newLevel
 (undocumented)
public JavaDoubleRDD unpersist()
public JavaDoubleRDD unpersist(boolean blocking)
blocking
 Whether to block until all blocks are deleted.
public Double first()
Specified by: first in interface JavaRDDLike
public JavaDoubleRDD distinct()
public JavaDoubleRDD distinct(int numPartitions)
numPartitions
 (undocumented)
public JavaDoubleRDD filter(Function<Double,Boolean> f)
f
 (undocumented)
public JavaDoubleRDD coalesce(int numPartitions)
Return a new RDD that is reduced into numPartitions partitions.
numPartitions
 (undocumented)
public JavaDoubleRDD coalesce(int numPartitions, boolean shuffle)
Return a new RDD that is reduced into numPartitions partitions.
numPartitions
 (undocumented)
shuffle
 (undocumented)
public JavaDoubleRDD repartition(int numPartitions)
Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data.
If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.
numPartitions
 (undocumented)
public JavaDoubleRDD subtract(JavaDoubleRDD other)
Return an RDD with the elements from this that are not in other.
Uses this RDD's partitioner/partition size, because even if other is huge, the resulting RDD will be no larger than this one.
other
 (undocumented)
public JavaDoubleRDD subtract(JavaDoubleRDD other, int numPartitions)
Return an RDD with the elements from this that are not in other.
other
 (undocumented)
numPartitions
 (undocumented)
public JavaDoubleRDD subtract(JavaDoubleRDD other, Partitioner p)
Return an RDD with the elements from this that are not in other.
other
 (undocumented)
p
 (undocumented)
public JavaDoubleRDD sample(boolean withReplacement, Double fraction)
withReplacement
 (undocumented)
fraction
 (undocumented)
public JavaDoubleRDD sample(boolean withReplacement, Double fraction, long seed)
withReplacement
 (undocumented)
fraction
 (undocumented)
seed
 (undocumented)
public JavaDoubleRDD union(JavaDoubleRDD other)
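Return the union of this RDD and another one. Any identical elements will appear multiple times (use .distinct() to eliminate them).
other
 (undocumented)
public JavaDoubleRDD intersection(JavaDoubleRDD other)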
Note that this method performs a shuffle internally.
other
 (undocumented)
public Double sum()
public Double min()
public Double max()
public StatCounter stats()
Return a StatCounter object that captures the mean, variance and count of the RDD's elements in one operation.
public Double mean()
public Double variance()
public Double stdev()
public Double sampleStdev()
public Double sampleVariance()
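The sample variants differ from variance() and stdev() only in the divisor (N-1 rather than N). As a rough illustration of that difference, the plain-Java sketch below accumulates count, mean and variance in a single pass using Welford's update and then derives both forms. It has no Spark dependency; RunningStats and all of its methods are hypothetical names chosen for this sketch, and it is not meant to describe how StatCounter is actually implemented.

```java
// Illustrative sketch only: plain Java, no Spark classes involved.
// RunningStats is a hypothetical name, not part of the Spark API.
final class RunningStats {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    /** Fold one value into the accumulator (Welford's single-pass update). */
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Population variance: divides by N. NaN for an empty input is this sketch's choice. */
    double variance() {
        return count == 0 ? Double.NaN : m2 / count;
    }

    /** Sample variance: divides by N-1. NaN for fewer than two values is this sketch's choice. */
    double sampleVariance() {
        return count <= 1 ? Double.NaN : m2 / (count - 1);
    }

    double stdev() { return Math.sqrt(variance()); }

    double sampleStdev() { return Math.sqrt(sampleVariance()); }

    /** Convenience factory that folds all values in order. */
    static RunningStats of(double... values) {
        RunningStats s = new RunningStats();
        for (double v : values) {
            s.add(v);
        }
        return s;
    }

    public static void main(String[] args) {
        RunningStats s = RunningStats.of(2, 4, 4, 4, 5, 5, 7, 9);
        System.out.println("count          = " + s.count());
        System.out.println("mean           = " + s.mean());
        System.out.println("variance (N)   = " + s.variance());
        System.out.println("variance (N-1) = " + s.sampleVariance());
    }
}
```

For the eight values in main, the squared deviations from the mean of 5 sum to 32, so the population variance is 32/8 and the sample variance is 32/7. The gap between the two shrinks as N grows.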
public PartialResult<BoundedDouble> meanApprox(long timeout, Double confidence)
public PartialResult<BoundedDouble> meanApprox(long timeout)
timeout
 (undocumented)
public PartialResult<BoundedDouble> sumApprox(long timeout, Double confidence)
timeout
 (undocumented)
confidence
 (undocumented)
public PartialResult<BoundedDouble> sumApprox(long timeout)
timeout
 (undocumented)
public scala.Tuple2<double[],long[]> histogram(int bucketCount)
bucketCount
 (undocumented)
public long[] histogram(double[] buckets)
Note: if your histogram is evenly spaced (e.g. [0, 10, 20, 30]), setting evenBuckets to true switches the per-element cost from an O(log n) insertion to O(1), where n is the number of buckets. buckets must be sorted and must not contain any duplicates, and the buckets array must have at least two elements. All NaN entries are treated the same. If you have a NaN bucket, it must be the maximum value in the last position, and all NaN entries will be counted in that bucket.
buckets
 (undocumented)
public long[] histogram(Double[] buckets, boolean evenBuckets)
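The O(log n) versus O(1) distinction behind evenBuckets can be sketched in plain Java. The example below is only an illustration and does not call Spark: BucketLookup and its methods are hypothetical names, and the bucket convention it uses (each bucket includes its lower bound, the last bucket also includes the upper bound, out-of-range and NaN values are skipped) is an assumption of this sketch rather than something stated in the note above.

```java
import java.util.Arrays;

// Illustrative sketch only: plain Java, no Spark classes involved.
// BucketLookup is a hypothetical name, not part of the Spark API.
final class BucketLookup {

    /** O(log n) lookup: binary search over arbitrary sorted boundaries. Returns -1 if x is not counted. */
    static int binarySearchBucket(double[] buckets, double x) {
        int last = buckets.length - 1;
        if (Double.isNaN(x) || x < buckets[0] || x > buckets[last]) {
            return -1; // this sketch simply skips NaN and out-of-range values
        }
        int pos = Arrays.binarySearch(buckets, x);
        if (pos >= 0) {
            // Exact hit on a boundary: the value opens the bucket starting there,
            // except the top boundary, which is assumed to close the last bucket.
            return Math.min(pos, last - 1);
        }
        int insertionPoint = -pos - 1; // index of the first boundary greater than x
        return insertionPoint - 1;
    }

    /** O(1) lookup: valid only when the boundaries are evenly spaced between min and max. */
    static int evenBucket(double min, double max, int bucketCount, double x) {
        if (Double.isNaN(x) || x < min || x > max) {
            return -1;
        }
        double width = (max - min) / bucketCount;
        int idx = (int) ((x - min) / width);
        return Math.min(idx, bucketCount - 1); // x == max falls into the last bucket
    }

    /** Count data into buckets, choosing the lookup strategy from the evenBuckets flag. */
    static long[] histogram(double[] buckets, double[] data, boolean evenBuckets) {
        long[] counts = new long[buckets.length - 1];
        double min = buckets[0];
        double max = buckets[buckets.length - 1];
        for (double x : data) {
            int i = evenBuckets
                    ? evenBucket(min, max, counts.length, x)
                    : binarySearchBucket(buckets, x);
            if (i >= 0) {
                counts[i]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] buckets = {0, 10, 20, 30};
        double[] data = {0, 5, 10, 15, 29.5, 30, -1, 31};
        System.out.println(Arrays.toString(histogram(buckets, data, false)));
        System.out.println(Arrays.toString(histogram(buckets, data, true)));
    }
}
```

Both strategies give the same counts when the boundaries really are evenly spaced. The arithmetic shortcut is only safe in that case, which is presumably why the flag has to be set explicitly by the caller.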
public JavaDoubleRDD setName(String name)