Package org.apache.spark.mllib.random
Class RandomRDDs
Object
org.apache.spark.mllib.random.RandomRDDs
Generator methods for creating RDDs comprised of i.i.d. samples from some distribution.
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size) | RandomRDDs.exponentialJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions) | RandomRDDs.exponentialJavaRDD with the default seed.
static JavaDoubleRDD | exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.exponentialRDD.
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols) | RandomRDDs.exponentialJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions) | RandomRDDs.exponentialJavaVectorRDD with the default seed.
static JavaRDD<Vector> | exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.exponentialVectorRDD.
static RDD<Object> | exponentialRDD(SparkContext sc, double mean, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
static RDD<Vector> | exponentialVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size) | RandomRDDs.gammaJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions) | RandomRDDs.gammaJavaRDD with the default seed.
static JavaDoubleRDD | gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.gammaRDD.
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols) | RandomRDDs.gammaJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions) | RandomRDDs.gammaJavaVectorRDD with the default seed.
static JavaRDD<Vector> | gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.gammaVectorRDD.
static RDD<Object> | gammaRDD(SparkContext sc, double shape, double scale, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
static RDD<Vector> | gammaVectorRDD(SparkContext sc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size) | RandomRDDs.logNormalJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions) | RandomRDDs.logNormalJavaRDD with the default seed.
static JavaDoubleRDD | logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.logNormalRDD.
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols) | RandomRDDs.logNormalJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions) | RandomRDDs.logNormalJavaVectorRDD with the default seed.
static JavaRDD<Vector> | logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.logNormalVectorRDD.
static RDD<Object> | logNormalRDD(SparkContext sc, double mean, double std, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.
static RDD<Vector> | logNormalVectorRDD(SparkContext sc, double mean, double std, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size) | RandomRDDs.normalJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions) | RandomRDDs.normalJavaRDD with the default seed.
static JavaDoubleRDD | normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.normalRDD.
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols) | RandomRDDs.normalJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions) | RandomRDDs.normalJavaVectorRDD with the default seed.
static JavaRDD<Vector> | normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.normalVectorRDD.
static RDD<Object> | normalRDD(SparkContext sc, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
static RDD<Vector> | normalVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size) | RandomRDDs.poissonJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions) | RandomRDDs.poissonJavaRDD with the default seed.
static JavaDoubleRDD | poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.poissonRDD.
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols) | RandomRDDs.poissonJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions) | RandomRDDs.poissonJavaVectorRDD with the default seed.
static JavaRDD<Vector> | poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.poissonVectorRDD.
static RDD<Object> | poissonRDD(SparkContext sc, double mean, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
static RDD<Vector> | poissonVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size) | RandomRDDs.randomJavaRDD with the default seed & numPartitions.
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions) | RandomRDDs.randomJavaRDD with the default seed.
static <T> JavaRDD<T> | randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols) | RandomRDDs.randomJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions) | RandomRDDs.randomJavaVectorRDD with the default seed.
static JavaRDD<Vector> | randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.randomVectorRDD.
static <T> RDD<T> | randomRDD(SparkContext sc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed, scala.reflect.ClassTag<T> evidence$1) | Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
static RDD<Vector> | randomVectorRDD(SparkContext sc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size) | RandomRDDs.uniformJavaRDD with the default number of partitions and the default seed.
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions) | RandomRDDs.uniformJavaRDD with the default seed.
static JavaDoubleRDD | uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed) | Java-friendly version of RandomRDDs.uniformRDD.
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols) | RandomRDDs.uniformJavaVectorRDD with the default number of partitions and the default seed.
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions) | RandomRDDs.uniformJavaVectorRDD with the default seed.
static JavaRDD<Vector> | uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed) | Java-friendly version of RandomRDDs.uniformVectorRDD.
static RDD<Object> | uniformRDD(SparkContext sc, long size, int numPartitions, long seed) | Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
static RDD<Vector> | uniformVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed) | Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution on U(0.0, 1.0).
-
Constructor Details
-
RandomRDDs
public RandomRDDs()
-
-
Method Details
-
uniformRDD
public static RDD<Object> uniformRDD(SparkContext sc, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the uniform distribution U(0.0, 1.0).
To transform the distribution in the generated RDD from U(0.0, 1.0) to U(a, b), use RandomRDDs.uniformRDD(sc, n, p, seed).map(v => a + (b - a) * v).
Parameters:
sc - SparkContext used to create the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples ~ U(0.0, 1.0).
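The rescaling from U(0.0, 1.0) to U(a, b) can be tried locally without a cluster. The following is a minimal sketch that uses java.util.Random as a stand-in for the values of the generated RDD; the class UniformRescale and its method names are hypothetical and not part of Spark.

```java
import java.util.Random;

// Hypothetical helper, not part of Spark: rescales U(0, 1) draws to U(a, b).
final class UniformRescale {
    // The same affine map as in the description: v -> a + (b - a) * v.
    static double rescale(double v, double a, double b) {
        return a + (b - a) * v;
    }

    // Draws n values from U(0, 1) with a fixed seed and rescales each one.
    static double[] sample(int n, double a, double b, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rescale(rng.nextDouble(), a, b);
        }
        return out;
    }

    public static void main(String[] args) {
        // Every printed value lies between 2.0 and 5.0.
        for (double x : sample(5, 2.0, 5.0, 42L)) {
            System.out.println(x);
        }
    }
}
```

With Spark on the classpath, the body of rescale is what would be passed to map on the generated RDD.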
-
uniformJavaRDD
public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.uniformRDD.
Parameters:
jsc - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
uniformJavaRDD
public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
RandomRDDs.uniformJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
uniformJavaRDD
public static JavaDoubleRDD uniformJavaRDD(JavaSparkContext jsc, long size)
RandomRDDs.uniformJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
normalRDD
public static RDD<Object> normalRDD(SparkContext sc, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the standard normal distribution.
To transform the distribution in the generated RDD from standard normal to some other normal N(mean, sigma^2), use RandomRDDs.normalRDD(sc, n, p, seed).map(v => mean + sigma * v).
Parameters:
sc - SparkContext used to create the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples ~ N(0.0, 1.0).
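The shift-and-scale step from N(0.0, 1.0) to N(mean, sigma^2) can be checked the same way. This sketch assumes java.util.Random.nextGaussian() as a local source of standard normal draws; NormalRescale and its methods are hypothetical names, not Spark API.

```java
import java.util.Random;

// Hypothetical helper, not part of Spark: maps N(0, 1) draws to N(mean, sigma^2).
final class NormalRescale {
    // The same map as in the description: v -> mean + sigma * v.
    static double rescale(double v, double mean, double sigma) {
        return mean + sigma * v;
    }

    // Draws n standard normal values with a fixed seed and rescales each one.
    static double[] sample(int n, double mean, double sigma, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rescale(rng.nextGaussian(), mean, sigma);
        }
        return out;
    }

    // Arithmetic mean of the samples, used to sanity-check the shift.
    static double average(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        // The sample average should land near the requested mean of 10.0.
        System.out.println(average(sample(100_000, 10.0, 2.0, 7L)));
    }
}
```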
-
normalJavaRDD
public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.normalRDD.
Parameters:
jsc - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
normalJavaRDD
public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size, int numPartitions)
RandomRDDs.normalJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
normalJavaRDD
public static JavaDoubleRDD normalJavaRDD(JavaSparkContext jsc, long size)
RandomRDDs.normalJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
poissonRDD
public static RDD<Object> poissonRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the Poisson distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or lambda, for the Poisson distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples ~ Pois(mean).
-
poissonJavaRDD
public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.poissonRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
poissonJavaRDD
public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
RandomRDDs.poissonJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
poissonJavaRDD
public static JavaDoubleRDD poissonJavaRDD(JavaSparkContext jsc, double mean, long size)
RandomRDDs.poissonJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
exponentialRDD
public static RDD<Object> exponentialRDD(SparkContext sc, double mean, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the exponential distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or 1 / lambda, for the exponential distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples ~ Exp(mean).
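Because the parameter is the mean (1 / lambda) rather than the rate, a rate of lambda = 0.5 corresponds to mean = 2.0. The sketch below illustrates that parameterization with textbook inverse-transform sampling, x = -mean * ln(1 - u) for u ~ U(0, 1). This is an assumption made for illustration only: the method Spark uses internally may differ, and ExponentialSketch is a hypothetical class.

```java
import java.util.Random;

// Hypothetical helper, not part of Spark: exponential draws parameterized by the mean.
final class ExponentialSketch {
    // Inverse-transform sampling: maps u in [0, 1) to a non-negative exponential value.
    static double fromUniform(double u, double mean) {
        return -mean * Math.log(1.0 - u);
    }

    // Draws n exponential values with the given mean and a fixed seed.
    static double[] sample(int n, double mean, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = fromUniform(rng.nextDouble(), mean);
        }
        return out;
    }

    // Arithmetic mean of the samples, used to compare against the requested mean.
    static double average(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        double lambda = 0.5;
        double mean = 1.0 / lambda; // the value that would be passed as the mean argument
        // The sample average should land near 2.0.
        System.out.println(average(sample(200_000, mean, 11L)));
    }
}
```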
-
exponentialJavaRDD
public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.exponentialRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
exponentialJavaRDD
public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size, int numPartitions)
RandomRDDs.exponentialJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
exponentialJavaRDD
public static JavaDoubleRDD exponentialJavaRDD(JavaSparkContext jsc, double mean, long size)
RandomRDDs.exponentialJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
gammaRDD
public static RDD<Object> gammaRDD(SparkContext sc, double shape, double scale, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
Parameters:
sc - SparkContext used to create the RDD.
shape - Shape parameter (greater than 0) for the gamma distribution.
scale - Scale parameter (greater than 0) for the gamma distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples from the gamma distribution with the input shape and scale.
-
gammaJavaRDD
public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.gammaRDD.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
gammaJavaRDD
public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size, int numPartitions)
RandomRDDs.gammaJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
gammaJavaRDD
public static JavaDoubleRDD gammaJavaRDD(JavaSparkContext jsc, double shape, double scale, long size)
RandomRDDs.gammaJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
logNormalRDD
public static RDD<Object> logNormalRDD(SparkContext sc, double mean, double std, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean for the log normal distribution.
std - Standard deviation for the log normal distribution.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Double] comprised of i.i.d. samples from the log normal distribution with the input mean and standard deviation.
-
logNormalJavaRDD
public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions, long seed)
Java-friendly version of RandomRDDs.logNormalRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
logNormalJavaRDD
public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size, int numPartitions)
RandomRDDs.logNormalJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
logNormalJavaRDD
public static JavaDoubleRDD logNormalJavaRDD(JavaSparkContext jsc, double mean, double std, long size)
RandomRDDs.logNormalJavaRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
randomRDD
public static <T> RDD<T> randomRDD(SparkContext sc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed, scala.reflect.ClassTag<T> evidence$1)
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
sc - SparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
evidence$1 - (undocumented)
Returns:
RDD[T] comprised of i.i.d. samples produced by generator.
-
randomJavaRDD
public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions, long seed)
Generates an RDD comprised of i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
jsc - JavaSparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
size - Size of the RDD.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[T] comprised of i.i.d. samples produced by generator.
-
randomJavaRDD
public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size, int numPartitions)
RandomRDDs.randomJavaRDD with the default seed.
Parameters:
jsc - (undocumented)
generator - (undocumented)
size - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
randomJavaRDD
public static <T> JavaRDD<T> randomJavaRDD(JavaSparkContext jsc, RandomDataGenerator<T> generator, long size)
RandomRDDs.randomJavaRDD with the default seed & numPartitions.
Parameters:
jsc - (undocumented)
generator - (undocumented)
size - (undocumented)
Returns:
(undocumented)
-
uniformVectorRDD
public static RDD<Vector> uniformVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the uniform distribution on U(0.0, 1.0).
Parameters:
sc - SparkContext used to create the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD.
seed - Seed for the RNG that generates the seed for the generator in each partition.
Returns:
RDD[Vector] with vectors containing i.i.d. samples ~ U(0.0, 1.0).
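The seed description above implies two levels of generators: one RNG seeded with the user-supplied seed produces a seed per partition, and each partition then fills its rows from its own generator. The following is a rough local sketch of that idea using java.util.Random; it is not Spark's actual seeding scheme, and PartitionSeedSketch and its methods are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration, not Spark code: one master seed fans out to per-partition seeds.
final class PartitionSeedSketch {
    // Derives one seed per partition from a single master seed.
    static long[] partitionSeeds(long masterSeed, int numPartitions) {
        Random master = new Random(masterSeed);
        long[] seeds = new long[numPartitions];
        for (int p = 0; p < numPartitions; p++) {
            seeds[p] = master.nextLong();
        }
        return seeds;
    }

    // Fills one partition's rows with U(0, 1) draws from that partition's own generator.
    static double[][] partitionRows(long partitionSeed, int numRows, int numCols) {
        Random rng = new Random(partitionSeed);
        double[][] rows = new double[numRows][numCols];
        for (int r = 0; r < numRows; r++) {
            for (int c = 0; c < numCols; c++) {
                rows[r][c] = rng.nextDouble();
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        long[] seeds = partitionSeeds(7L, 4);
        // Print the first element generated in each of the 4 partitions.
        for (long s : seeds) {
            System.out.println(partitionRows(s, 2, 3)[0][0]);
        }
    }
}
```

Under this scheme, repeating a run with the same master seed and the same number of partitions reproduces the same rows, which is the practical reason to pass an explicit seed.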
-
uniformJavaVectorRDD
public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.uniformVectorRDD.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
uniformJavaVectorRDD
public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
RandomRDDs.uniformJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
uniformJavaVectorRDD
public static JavaRDD<Vector> uniformJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
RandomRDDs.uniformJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
normalVectorRDD
public static RDD<Vector> normalVectorRDD(SparkContext sc, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the standard normal distribution.
Parameters:
sc - SparkContext used to create the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples ~ N(0.0, 1.0).
-
normalJavaVectorRDD
public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.normalVectorRDD.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
normalJavaVectorRDD
public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols, int numPartitions)
RandomRDDs.normalJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
normalJavaVectorRDD
public static JavaRDD<Vector> normalJavaVectorRDD(JavaSparkContext jsc, long numRows, int numCols)
RandomRDDs.normalJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
logNormalVectorRDD
public static RDD<Vector> logNormalVectorRDD(SparkContext sc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from a log normal distribution.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean of the log normal distribution.
std - Standard deviation of the log normal distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples.
-
logNormalJavaVectorRDD
public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.logNormalVectorRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
logNormalJavaVectorRDD
public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols, int numPartitions)
RandomRDDs.logNormalJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
logNormalJavaVectorRDD
public static JavaRDD<Vector> logNormalJavaVectorRDD(JavaSparkContext jsc, double mean, double std, long numRows, int numCols)
RandomRDDs.logNormalJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
std - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
poissonVectorRDD
public static RDD<Vector> poissonVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the Poisson distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or lambda, for the Poisson distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples ~ Pois(mean).
-
poissonJavaVectorRDD
public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.poissonVectorRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
poissonJavaVectorRDD
public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
RandomRDDs.poissonJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
poissonJavaVectorRDD
public static JavaRDD<Vector> poissonJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
RandomRDDs.poissonJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
exponentialVectorRDD
public static RDD<Vector> exponentialVectorRDD(SparkContext sc, double mean, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the exponential distribution with the input mean.
Parameters:
sc - SparkContext used to create the RDD.
mean - Mean, or 1 / lambda, for the exponential distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples ~ Exp(mean).
-
exponentialJavaVectorRDD
public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.exponentialVectorRDD.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
exponentialJavaVectorRDD
public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols, int numPartitions)
RandomRDDs.exponentialJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
exponentialJavaVectorRDD
public static JavaRDD<Vector> exponentialJavaVectorRDD(JavaSparkContext jsc, double mean, long numRows, int numCols)
RandomRDDs.exponentialJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
mean - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
gammaVectorRDD
public static RDD<Vector> gammaVectorRDD(SparkContext sc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples drawn from the gamma distribution with the input shape and scale.
Parameters:
sc - SparkContext used to create the RDD.
shape - Shape parameter (greater than 0) for the gamma distribution.
scale - Scale parameter (greater than 0) for the gamma distribution.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples from the gamma distribution with the input shape and scale.
-
gammaJavaVectorRDD
public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.gammaVectorRDD.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
gammaJavaVectorRDD
public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols, int numPartitions)
RandomRDDs.gammaJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
gammaJavaVectorRDD
public static JavaRDD<Vector> gammaJavaVectorRDD(JavaSparkContext jsc, double shape, double scale, long numRows, int numCols)
RandomRDDs.gammaJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
shape - (undocumented)
scale - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-
randomVectorRDD
public static RDD<Vector> randomVectorRDD(SparkContext sc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
Generates an RDD[Vector] with vectors containing i.i.d. samples produced by the input RandomDataGenerator.
Parameters:
sc - SparkContext used to create the RDD.
generator - RandomDataGenerator used to populate the RDD.
numRows - Number of Vectors in the RDD.
numCols - Number of elements in each Vector.
numPartitions - Number of partitions in the RDD (default: sc.defaultParallelism).
seed - Random seed (default: a random long integer).
Returns:
RDD[Vector] with vectors containing i.i.d. samples produced by generator.
-
randomJavaVectorRDD
public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions, long seed)
Java-friendly version of RandomRDDs.randomVectorRDD.
Parameters:
jsc - (undocumented)
generator - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
seed - (undocumented)
Returns:
(undocumented)
-
randomJavaVectorRDD
public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols, int numPartitions)
RandomRDDs.randomJavaVectorRDD with the default seed.
Parameters:
jsc - (undocumented)
generator - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
numPartitions - (undocumented)
Returns:
(undocumented)
-
randomJavaVectorRDD
public static JavaRDD<Vector> randomJavaVectorRDD(JavaSparkContext jsc, RandomDataGenerator<Object> generator, long numRows, int numCols)
RandomRDDs.randomJavaVectorRDD with the default number of partitions and the default seed.
Parameters:
jsc - (undocumented)
generator - (undocumented)
numRows - (undocumented)
numCols - (undocumented)
Returns:
(undocumented)
-