Package org.apache.spark.mllib.util
Class LogisticRegressionDataGenerator
Object
org.apache.spark.mllib.util.LogisticRegressionDataGenerator
Generate test data for LogisticRegression. This class chooses positive labels with probability probOne and scales features for positive examples by eps.
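The behavior described above can be illustrated with a minimal plain-Java sketch that does not depend on Spark. It is not the library's implementation. The names LogisticDataSketch, Example, and generate, and the seed parameter, are hypothetical and exist only for this illustration. Two assumptions are made here: base features are drawn from a standard Gaussian, and "scales" means multiplying each feature by eps.

```java
import java.util.Random;

/** Hypothetical, Spark-free sketch of the generation scheme described above. */
public class LogisticDataSketch {

    /** One generated example: a 0/1 label and its feature values. */
    public static final class Example {
        public final double label;
        public final double[] features;

        public Example(double label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    /**
     * Generates nexamples examples with nfeatures features each.
     * A label is 1.0 with probability probOne, otherwise 0.0.
     * Assumption: features are standard Gaussian draws, multiplied by eps
     * for positive examples only.
     */
    public static Example[] generate(int nexamples, int nfeatures,
                                     double eps, double probOne, long seed) {
        Random rnd = new Random(seed);
        Example[] out = new Example[nexamples];
        for (int i = 0; i < nexamples; i++) {
            // nextDouble() is uniform on [0, 1), so this is true with probability probOne.
            double label = rnd.nextDouble() < probOne ? 1.0 : 0.0;
            double[] features = new double[nfeatures];
            for (int j = 0; j < nfeatures; j++) {
                double g = rnd.nextGaussian();
                features[j] = (label == 1.0) ? g * eps : g;
            }
            out[i] = new Example(label, features);
        }
        return out;
    }
}
```

For example, LogisticDataSketch.generate(1000, 10, 3.0, 0.5, 42L) would produce about as many positive as negative examples, with the positive examples' features spread three times as widely as the negative ones.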
Constructor Summary
Method Summary
Modifier and Type | Method | Description
static RDD<LabeledPoint> | generateLogisticRDD(SparkContext sc, int nexamples, int nfeatures, double eps, int nparts, double probOne) | Generate an RDD containing test data for LogisticRegression.
static void | main |
Constructor Details
LogisticRegressionDataGenerator
public LogisticRegressionDataGenerator()
Method Details
generateLogisticRDD
public static RDD<LabeledPoint> generateLogisticRDD(SparkContext sc, int nexamples, int nfeatures, double eps, int nparts, double probOne)
Generate an RDD containing test data for LogisticRegression.
Parameters:
sc - SparkContext to use for creating the RDD.
nexamples - Number of examples that will be contained in the RDD.
nfeatures - Number of features to generate for each example.
eps - Epsilon factor by which positive examples are scaled.
nparts - Number of partitions of the generated RDD. Default value is 2.
probOne - Probability that a label is 1 (and not 0). Default value is 0.5.
Returns:
(undocumented)
main
-