|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
Object org.apache.spark.api.java.JavaRDD<T>
public class JavaRDD<T>
Constructor Summary | |
---|---|
JavaRDD(RDD<T> rdd,
scala.reflect.ClassTag<T> classTag)
|
Method Summary | ||
---|---|---|
JavaRDD<T> |
cache()
Persist this RDD with the default storage level (`MEMORY_ONLY`). |
|
scala.reflect.ClassTag<T> |
classTag()
|
|
JavaRDD<T> |
coalesce(int numPartitions)
Return a new RDD that is reduced into numPartitions partitions. |
|
JavaRDD<T> |
coalesce(int numPartitions,
boolean shuffle)
Return a new RDD that is reduced into numPartitions partitions. |
|
JavaRDD<T> |
distinct()
Return a new RDD containing the distinct elements in this RDD. |
|
JavaRDD<T> |
distinct(int numPartitions)
Return a new RDD containing the distinct elements in this RDD. |
|
JavaRDD<T> |
filter(Function<T,Boolean> f)
Return a new RDD containing only the elements that satisfy a predicate. |
|
static
|
fromRDD(RDD<T> rdd,
scala.reflect.ClassTag<T> evidence$1)
|
|
JavaRDD<T> |
intersection(JavaRDD<T> other)
Return the intersection of this RDD and another one. |
|
JavaRDD<T> |
persist(StorageLevel newLevel)
Set this RDD's storage level to persist its values across operations after the first time it is computed. |
|
JavaRDD<T>[] |
randomSplit(double[] weights)
Randomly splits this RDD with the provided weights. |
|
JavaRDD<T>[] |
randomSplit(double[] weights,
long seed)
Randomly splits this RDD with the provided weights. |
|
RDD<T> |
rdd()
|
|
JavaRDD<T> |
repartition(int numPartitions)
Return a new RDD that has exactly numPartitions partitions. |
|
JavaRDD<T> |
sample(boolean withReplacement,
double fraction)
Return a sampled subset of this RDD. |
|
JavaRDD<T> |
sample(boolean withReplacement,
double fraction,
long seed)
Return a sampled subset of this RDD. |
|
JavaRDD<T> |
setName(String name)
Assign a name to this RDD |
|
|
sortBy(Function<T,S> f,
boolean ascending,
int numPartitions)
Return this RDD sorted by the given key function. |
|
JavaRDD<T> |
subtract(JavaRDD<T> other)
Return an RDD with the elements from this that are not in other . |
|
JavaRDD<T> |
subtract(JavaRDD<T> other,
int numPartitions)
Return an RDD with the elements from this that are not in other . |
|
JavaRDD<T> |
subtract(JavaRDD<T> other,
Partitioner p)
Return an RDD with the elements from this that are not in other . |
|
static
|
toRDD(JavaRDD<T> rdd)
|
|
String |
toString()
|
|
JavaRDD<T> |
union(JavaRDD<T> other)
Return the union of this RDD and another one. |
|
JavaRDD<T> |
unpersist()
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk. |
|
JavaRDD<T> |
unpersist(boolean blocking)
Mark the RDD as non-persistent, and remove all blocks for it from memory and disk. |
|
JavaRDD<T> |
wrapRDD(RDD<T> rdd)
|
Methods inherited from class Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public JavaRDD(RDD<T> rdd, scala.reflect.ClassTag<T> classTag)
Method Detail |
---|
public static <T> JavaRDD<T> fromRDD(RDD<T> rdd, scala.reflect.ClassTag<T> evidence$1)
public static <T> RDD<T> toRDD(JavaRDD<T> rdd)
public RDD<T> rdd()
public scala.reflect.ClassTag<T> classTag()
public JavaRDD<T> wrapRDD(RDD<T> rdd)
public JavaRDD<T> cache()
public JavaRDD<T> persist(StorageLevel newLevel)
newLevel
- (undocumented)
public JavaRDD<T> unpersist()
public JavaRDD<T> unpersist(boolean blocking)
blocking
- Whether to block until all blocks are deleted.
public JavaRDD<T> distinct()
public JavaRDD<T> distinct(int numPartitions)
numPartitions
- (undocumented)
public JavaRDD<T> filter(Function<T,Boolean> f)
f
- (undocumented)
public JavaRDD<T> coalesce(int numPartitions)
numPartitions
partitions.
numPartitions
- (undocumented)
public JavaRDD<T> coalesce(int numPartitions, boolean shuffle)
numPartitions
partitions.
numPartitions
- (undocumented)shuffle
- (undocumented)
public JavaRDD<T> repartition(int numPartitions)
Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data.
If you are decreasing the number of partitions in this RDD, consider using coalesce
,
which can avoid performing a shuffle.
numPartitions
- (undocumented)
public JavaRDD<T> sample(boolean withReplacement, double fraction)
withReplacement
- can elements be sampled multiple times (replaced when sampled out)fraction
- expected size of the sample as a fraction of this RDD's size
without replacement: probability that each element is chosen; fraction must be [0, 1]
with replacement: expected number of times each element is chosen; fraction must be >= 0
public JavaRDD<T> sample(boolean withReplacement, double fraction, long seed)
withReplacement
- can elements be sampled multiple times (replaced when sampled out)fraction
- expected size of the sample as a fraction of this RDD's size
without replacement: probability that each element is chosen; fraction must be [0, 1]
with replacement: expected number of times each element is chosen; fraction must be >= 0seed
- seed for the random number generator
public JavaRDD<T>[] randomSplit(double[] weights)
weights
- weights for splits, will be normalized if they don't sum to 1
public JavaRDD<T>[] randomSplit(double[] weights, long seed)
weights
- weights for splits, will be normalized if they don't sum to 1seed
- random seed
public JavaRDD<T> union(JavaRDD<T> other)
.distinct()
to eliminate them).
other
- (undocumented)
public JavaRDD<T> intersection(JavaRDD<T> other)
Note that this method performs a shuffle internally.
other
- (undocumented)
public JavaRDD<T> subtract(JavaRDD<T> other)
this
that are not in other
.
Uses this
partitioner/partition size, because even if other
is huge, the resulting
RDD will be <= us.
other
- (undocumented)
public JavaRDD<T> subtract(JavaRDD<T> other, int numPartitions)
this
that are not in other
.
other
- (undocumented)numPartitions
- (undocumented)
public JavaRDD<T> subtract(JavaRDD<T> other, Partitioner p)
this
that are not in other
.
other
- (undocumented)p
- (undocumented)
public String toString()
toString
in class Object
public JavaRDD<T> setName(String name)
public <S> JavaRDD<T> sortBy(Function<T,S> f, boolean ascending, int numPartitions)
f
- (undocumented)ascending
- (undocumented)numPartitions
- (undocumented)
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |