Class RRDD<T>

Object
org.apache.spark.rdd.RDD<byte[]>
org.apache.spark.api.r.BaseRRDD<T,byte[]>
org.apache.spark.api.r.RRDD<T>
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, scala.Serializable

public class RRDD<T> extends BaseRRDD<T,byte[]>
An RDD that stores serialized R objects as Array[Byte].
See Also:
  Serialized Form
  • Constructor Details

    • RRDD

      public RRDD(RDD<T> parent, byte[] func, String deserializer, String serializer, byte[] packageNames, Object[] broadcastVars, scala.reflect.ClassTag<T> evidence$4)
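      A minimal sketch of invoking this constructor from Java. RRDD is internal to SparkR, so the byte-array payloads (serialized R function, package names) and the "string"/"byte" mode strings below are illustrative placeholders, not a working R closure:

          import java.util.Arrays;
          import org.apache.spark.api.java.JavaSparkContext;
          import org.apache.spark.api.r.RRDD;
          import org.apache.spark.rdd.RDD;
          import scala.reflect.ClassTag;
          import scala.reflect.ClassTag$;

          // Parent RDD whose elements would be piped through the R worker.
          RDD<String> parent = jsc.parallelize(Arrays.asList("a", "b")).rdd();

          byte[] serializedFunc = new byte[0];  // placeholder: a real serialized R function is required
          byte[] packageNames = new byte[0];    // placeholder: serialized R package names

          // Obtain a ClassTag for the parent RDD's element type.
          ClassTag<String> tag = ClassTag$.MODULE$.apply(String.class);

          RRDD<String> rrdd = new RRDD<>(
              parent,
              serializedFunc,
              "string",        // deserializer mode (assumption: one of SparkR's SerDe modes)
              "byte",          // serializer mode
              packageNames,
              new Object[0],   // broadcast variables
              tag);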
  • Method Details

    • createSparkContext

      public static JavaSparkContext createSparkContext(String master, String appName, String sparkHome, String[] jars, Map<Object,Object> sparkEnvirMap, Map<Object,Object> sparkExecutorEnvMap)
      Create a JavaSparkContext from R. Called by SparkR when it initializes the JVM-side Spark context; sparkEnvirMap holds Spark configuration properties and sparkExecutorEnvMap holds executor environment variables.
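      A short usage sketch. The master URL, application name, and Spark home path below are placeholder values for a local test run:

          import java.util.HashMap;
          import java.util.Map;
          import org.apache.spark.api.java.JavaSparkContext;
          import org.apache.spark.api.r.RRDD;

          Map<Object, Object> sparkEnv = new HashMap<>();
          sparkEnv.put("spark.executor.memory", "1g");  // applied as a Spark configuration property
          Map<Object, Object> executorEnv = new HashMap<>();

          JavaSparkContext jsc = RRDD.createSparkContext(
              "local[2]", "SparkR-example", "/opt/spark",
              new String[0], sparkEnv, executorEnv);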
    • createRDDFromArray

      public static JavaRDD<byte[]> createRDDFromArray(JavaSparkContext jsc, byte[][] arr)
      Create an RRDD given a sequence of byte arrays. Used to create RRDD when parallelize is called from R.
      Parameters:
      jsc - JavaSparkContext used to create the RDD
      arr - array of serialized R objects, one byte array per RDD element
      Returns:
      a JavaRDD<byte[]> with one element per input byte array
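      A short example, assuming jsc is an existing JavaSparkContext (e.g. from the createSparkContext example above); each byte array stands in for an R object that SparkR has already serialized:

          import java.nio.charset.StandardCharsets;
          import org.apache.spark.api.java.JavaRDD;

          byte[][] arr = new byte[][] {
              "first".getBytes(StandardCharsets.UTF_8),
              "second".getBytes(StandardCharsets.UTF_8)
          };
          JavaRDD<byte[]> rdd = RRDD.createRDDFromArray(jsc, arr);
          long n = rdd.count();  // 2: one element per input byte array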
    • createRDDFromFile

      public static JavaRDD<byte[]> createRDDFromFile(JavaSparkContext jsc, String fileName, int parallelism)
      Create an RRDD given a temporary file name. This is used to create RRDD when parallelize is called on large R objects.

      Parameters:
      jsc - JavaSparkContext used to create the RDD
      fileName - name of the temporary file on the driver machine
      parallelism - number of slices to divide the data into (defaults to 4)
      Returns:
      a JavaRDD<byte[]> of the serialized R objects read from the file
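      A sketch of the call, assuming jsc is an existing JavaSparkContext. The path is a placeholder for a temporary file that SparkR's parallelize writes on the driver in its internal record format; an arbitrary file will not deserialize on the R side:

          import org.apache.spark.api.java.JavaRDD;

          JavaRDD<byte[]> rdd = RRDD.createRDDFromFile(
              jsc, "/tmp/sparkr-objects.tmp", 4 /* slices */);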
    • asJavaRDD

      public JavaRDD<byte[]> asJavaRDD()
      Return this RDD of serialized R objects as a JavaRDD.
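      For example, given the rrdd built in the constructor sketch above:

          JavaRDD<byte[]> javaRdd = rrdd.asJavaRDD();
          // javaRdd exposes the serialized R objects through the Java-friendly RDD API.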