Package org.apache.spark.rdd

Class CoGroupedRDD<K>

Object
  org.apache.spark.rdd.RDD<scala.Tuple2<K,scala.collection.Iterable<Object>[]>>
    org.apache.spark.rdd.CoGroupedRDD<K>

All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging
:: DeveloperApi ::
An RDD that cogroups its parents. For each key k in the parent RDDs, the resulting RDD contains a tuple with the list of values for that key from each parent.

Parameters:
rdds - parent RDDs
part - partitioner used to partition the shuffle output

Note:
This is an internal API. We recommend users use RDD.cogroup(...) instead of instantiating this directly.
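Since the class is internal, the recommended path is the public RDD.cogroup API, which builds a CoGroupedRDD under the hood. A minimal, self-contained sketch of that usage (the app name, sample data, and partition count are illustrative, not part of this API):

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object CoGroupExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cogroup-sketch").setMaster("local[2]"))

    // Two pair RDDs sharing the key type K = String.
    val scores = sc.parallelize(Seq(("alice", 1), ("bob", 2), ("alice", 3)))
    val labels = sc.parallelize(Seq(("alice", "a"), ("carol", "c")))

    // For each key, the result carries one Iterable of values per parent RDD.
    val grouped = scores.cogroup(labels, new HashPartitioner(2))

    grouped.collect().foreach { case (k, (is, ss)) =>
      println(s"$k -> ints=${is.toList}, strings=${ss.toList}")
    }

    sc.stop()
  }
}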
Nested Class Summary

Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging:
org.apache.spark.internal.Logging.LogStringContext, org.apache.spark.internal.Logging.SparkShellLoggingFilter
Constructor Summary

Constructors:
CoGroupedRDD(scala.collection.immutable.Seq<RDD<? extends scala.Product2<K,?>>> rdds, Partitioner part, scala.reflect.ClassTag<K> evidence$1)
Method Summary

void clearDependencies()
scala.collection.Iterator<scala.Tuple2<K,scala.collection.Iterable<Object>[]>> compute(Partition s, TaskContext context)
    :: DeveloperApi :: Implemented by subclasses to compute a given partition.
scala.collection.immutable.Seq<Dependency<?>> getDependencies()
Partition[] getPartitions()
scala.Some<Partitioner> partitioner()
    Optionally overridden by subclasses to specify how they are partitioned.
scala.collection.immutable.Seq<RDD<? extends scala.Product2<K,?>>> rdds()
CoGroupedRDD<K> setSerializer(Serializer serializer)
    Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).

Methods inherited from class org.apache.spark.rdd.RDD:
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId

Methods inherited from class java.lang.Object:
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.internal.Logging:
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logBasedOnLevel, logDebug, logDebug, logDebug, logDebug, logError, logError, logError, logError, logInfo, logInfo, logInfo, logInfo, logName, LogStringContext, logTrace, logTrace, logTrace, logTrace, logWarning, logWarning, logWarning, logWarning, MDC, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq, withLogContext
Constructor Details

CoGroupedRDD

public CoGroupedRDD(scala.collection.immutable.Seq<RDD<? extends scala.Product2<K,?>>> rdds, Partitioner part, scala.reflect.ClassTag<K> evidence$1)
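For completeness, a hedged sketch of what direct instantiation looks like in Scala (again, prefer RDD.cogroup; the pair RDDs a and b are hypothetical placeholders):

import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.{CoGroupedRDD, RDD}

// `a` and `b` are assumed pair RDDs sharing the key type K = String.
def cogroupDirectly(a: RDD[(String, Int)], b: RDD[(String, String)]): CoGroupedRDD[String] =
  new CoGroupedRDD[String](
    Seq[RDD[_ <: Product2[String, _]]](a, b), // parents, upcast to the expected element bound
    new HashPartitioner(4))                   // partitioner for the shuffle output
  // The ClassTag[K] evidence parameter is supplied implicitly by the compiler.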
 
Method Details

clearDependencies

public void clearDependencies()

compute

public scala.collection.Iterator<scala.Tuple2<K,scala.collection.Iterable<Object>[]>> compute(Partition s, TaskContext context)

Description copied from class: RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
getDependencies

public scala.collection.immutable.Seq<Dependency<?>> getDependencies()

getPartitions

public Partition[] getPartitions()

partitioner

public scala.Some<Partitioner> partitioner()

Description copied from class: RDD
Optionally overridden by subclasses to specify how they are partitioned.

Overrides:
partitioner in class RDD<scala.Tuple2<K,scala.collection.Iterable<Object>[]>>
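A one-line illustration of this override, continuing the cogroup sketch from the class description (grouped is the hypothetical RDD built there):

// The resulting RDD reports the partitioner it was shuffled with
// (HashPartitioner equality is by partition count).
assert(grouped.partitioner == Some(new HashPartitioner(2)))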
 
rdds

public scala.collection.immutable.Seq<RDD<? extends scala.Product2<K,?>>> rdds()

setSerializer

public CoGroupedRDD<K> setSerializer(Serializer serializer)

Set a serializer for this RDD's shuffle, or null to use the default (spark.serializer).
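A short, hedged sketch of chaining setSerializer onto a directly constructed instance, assuming a SparkConf and the hypothetical parent RDDs from the constructor sketch; KryoSerializer is Spark's stock alternative to the spark.serializer default:

import org.apache.spark.serializer.KryoSerializer
import org.apache.spark.{HashPartitioner, SparkConf}
import org.apache.spark.rdd.{CoGroupedRDD, RDD}

// `conf`, `a`, and `b` are assumed to exist, as in the constructor sketch above.
def cogroupWithKryo(conf: SparkConf,
                    a: RDD[(String, Int)],
                    b: RDD[(String, String)]): CoGroupedRDD[String] =
  new CoGroupedRDD[String](Seq[RDD[_ <: Product2[String, _]]](a, b), new HashPartitioner(4))
    .setSerializer(new KryoSerializer(conf)) // shuffle this RDD's data with Kryo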
 