Package org.apache.spark.api.r
Class RRDD<T>
Object
org.apache.spark.rdd.RDD<U>
org.apache.spark.api.r.BaseRRDD<T,byte[]>
org.apache.spark.api.r.RRDD<T>
- All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, scala.Serializable
An RDD that stores serialized R objects as Array[Byte].
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging
org.apache.spark.internal.Logging.SparkShellLoggingFilter
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
JavaRDD<byte[]> | asJavaRDD() |
static JavaRDD<byte[]> | createRDDFromArray(JavaSparkContext jsc, byte[][] arr) | Create an RRDD given a sequence of byte arrays.
static JavaRDD<byte[]> | createRDDFromFile(JavaSparkContext jsc, String fileName, int parallelism) | Create an RRDD given a temporary file name.
static JavaSparkContext | createSparkContext(String master, String appName, String sparkHome, String[] jars, Map<Object, Object> sparkEnvirMap, Map<Object, Object> sparkExecutorEnvMap) |
Methods inherited from class org.apache.spark.api.r.BaseRRDD
compute, getPartitions
Methods inherited from class org.apache.spark.rdd.RDD
aggregate, barrier, cache, cartesian, checkpoint, cleanShuffleDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filter, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getResourceProfile, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithEvaluator, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeAggregate, treeReduce, union, unpersist, withResources, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitionsWithEvaluator, zipWithIndex, zipWithUniqueId
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging
initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
-
Constructor Details
-
RRDD
-
-
Method Details
-
createSparkContext
-
createRDDFromArray
Create an RRDD given a sequence of byte arrays. Used to create an RRDD when parallelize is called from R.
- Parameters:
jsc - (undocumented)
arr - (undocumented)
- Returns:
(undocumented)
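The page leaves arr undocumented. As a rough illustration of the kind of input the description implies, the sketch below builds a byte[][] in which each inner array is one serialized slice of a collection. That one-array-per-slice layout is an assumption, Java serialization stands in for R's own serialization, and SliceBytesSketch, toSlices, and fromSlice are hypothetical names that are not part of Spark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: shows one way a caller could
// prepare a byte[][] of serialized slices.
public class SliceBytesSketch {

    // Split values into numSlices contiguous chunks and serialize each chunk
    // into its own byte array (assumption: one inner array per slice).
    public static byte[][] toSlices(List<Integer> values, int numSlices) throws IOException {
        byte[][] slices = new byte[numSlices][];
        int n = values.size();
        for (int i = 0; i < numSlices; i++) {
            int start = i * n / numSlices;
            int end = (i + 1) * n / numSlices;
            ArrayList<Integer> chunk = new ArrayList<>(values.subList(start, end));
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(chunk);
            }
            slices[i] = buffer.toByteArray();
        }
        return slices;
    }

    // Decode one slice back into a list, to check the round trip.
    @SuppressWarnings("unchecked")
    public static List<Integer> fromSlice(byte[] slice) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(slice))) {
            return (List<Integer>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[][] arr = toSlices(List.of(1, 2, 3, 4, 5), 2);
        System.out.println(arr.length + " slices; first = " + fromSlice(arr[0]));
    }
}
```

With a JavaSparkContext available, an array built this way has the shape that createRDDFromArray(jsc, arr) accepts. The RDD treats the bytes as opaque, so any consumer has to know how they were encoded.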
-
createRDDFromFile
public static JavaRDD<byte[]> createRDDFromFile(JavaSparkContext jsc, String fileName, int parallelism)
Create an RRDD given a temporary file name. This is used to create an RRDD when parallelize is called on large R objects.
- Parameters:
fileName - name of the temporary file on the driver machine
parallelism - number of slices (defaults to 4)
jsc - (undocumented)
- Returns:
(undocumented)
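The page does not describe the layout of the temporary file. A minimal sketch follows, assuming each record is stored as a 4-byte big-endian length followed by that many bytes. This layout is an assumption for illustration and should be checked against the Spark source for the version in use. LengthPrefixedFileSketch, writeRecords, and readRecords are hypothetical names that are not part of Spark.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: writes and reads a file of
// length-prefixed byte records (assumed layout, see the note above).
public class LengthPrefixedFileSketch {

    // Write each record as a 4-byte big-endian length followed by its bytes.
    public static void writeRecords(Path file, List<byte[]> records) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            for (byte[] record : records) {
                out.writeInt(record.length);
                out.write(record);
            }
        }
    }

    // Read records until no further length prefix can be read.
    public static List<byte[]> readRecords(Path file) throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // end of file reached while reading a length prefix
                }
                byte[] record = new byte[length];
                in.readFully(record); // throws EOFException if the record is truncated
                records.add(record);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("rrdd-sketch", ".bin");
        try {
            writeRecords(tmp, List.of(new byte[] {1, 2, 3}, new byte[] {4}));
            System.out.println(readRecords(tmp).size() + " records read back");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Passing data through a file rather than as an in-memory byte[][] fits the stated purpose of this method: large R objects do not have to be held in a single call argument.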
-
asJavaRDD
-