public class WholeTextFileRDD extends NewHadoopRDD<String,String>
Nested Class Summary

Nested classes/interfaces inherited from class org.apache.spark.rdd.NewHadoopRDD:
NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD<U,T> (analogous to MapPartitionsRDD, but passes in an InputSplit to the given function rather than the index of the partition), NewHadoopRDD.NewHadoopMapPartitionsWithSplitRDD$
Constructor Summary

WholeTextFileRDD(SparkContext sc,
                 Class<? extends WholeTextFileInputFormat> inputFormatClass,
                 Class<String> keyClass,
                 Class<String> valueClass,
                 org.apache.hadoop.conf.Configuration conf,
                 int minPartitions)
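This constructor is internal to Spark; user code normally obtains a WholeTextFileRDD indirectly through SparkContext.wholeTextFiles, which supplies WholeTextFileInputFormat, the key/value classes, and the Hadoop configuration. A minimal sketch, assuming a local master; the directory path and app name are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: reading whole files as (path, content) pairs.
// The directory path and app name below are placeholders.
val sc = new SparkContext(
  new SparkConf().setAppName("whole-text-files").setMaster("local[*]"))

// Each file under the directory becomes a single (filename, content)
// record, backed internally by a WholeTextFileRDD.
val files = sc.wholeTextFiles("hdfs:///data/docs", minPartitions = 4)

files.collect().foreach { case (path, content) =>
  println(s"$path -> ${content.length} chars")
}
```

Because each file is materialized as one record, this is intended for many small files rather than a few large ones.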
Method Summary

Modifier and Type | Method and Description
---|---
Partition[] | getPartitions() Implemented by subclasses to return the set of partitions in this RDD.
Methods inherited from class org.apache.spark.rdd.NewHadoopRDD:
compute, getConf, getPreferredLocations, mapPartitionsWithInputSplit
Methods inherited from class org.apache.spark.rdd.RDD:
aggregate, cache, cartesian, checkpoint, checkpointData, coalesce, collect, collect, collectPartitions, computeOrReadCheckpoint, conf, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, creationSite, dependencies, distinct, distinct, doCheckpoint, elementClassTag, filter, filterWith, first, flatMap, flatMapWith, fold, foreach, foreachPartition, foreachWith, getCheckpointFile, getCreationSite, getNarrowAncestors, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, iterator, keyBy, map, mapPartitions, mapPartitionsWithContext, mapPartitionsWithIndex, mapPartitionsWithSplit, mapWith, markCheckpointed, max, min, name, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, reduce, repartition, retag, retag, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toArray, toDebugString, toJavaRDD, toLocalIterator, top, toString, union, unpersist, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipWithIndex, zipWithUniqueId
Methods inherited from interface org.apache.spark.mapreduce.SparkHadoopMapReduceUtil:
firstAvailableClass, newJobContext, newTaskAttemptContext, newTaskAttemptID
Methods inherited from interface org.apache.spark.Logging:
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
Constructor Detail

public WholeTextFileRDD(SparkContext sc,
                        Class<? extends WholeTextFileInputFormat> inputFormatClass,
                        Class<String> keyClass,
                        Class<String> valueClass,
                        org.apache.hadoop.conf.Configuration conf,
                        int minPartitions)
Method Detail

public Partition[] getPartitions()

Description copied from class: RDD
Implemented by subclasses to return the set of partitions in this RDD.

Overrides:
getPartitions in class NewHadoopRDD<String,String>
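getPartitions is not called directly by user code; it backs the RDD.partitions field. Because the input format combines many small files into shared splits, minPartitions acts as a hint rather than a guarantee, as this sketch (with a hypothetical path, on an existing SparkContext sc) illustrates:

```scala
// Sketch: minPartitions is a hint passed down to the input format;
// the actual partition count comes from getPartitions(), exposed via
// RDD.partitions. The path below is hypothetical.
val files = sc.wholeTextFiles("hdfs:///data/docs", minPartitions = 8)

// One partition per combined input split; the count may differ from
// the hint depending on the number and sizes of the input files.
println(files.partitions.length)
```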