pyspark.RDD.union

RDD.union(other)

Return the union of this RDD and another one. Duplicate elements are kept; call distinct() on the result to remove them.

Examples

>>> rdd = sc.parallelize([1, 1, 2, 3])
>>> rdd.union(rdd).collect()
[1, 1, 2, 3, 1, 1, 2, 3]