org.apache.spark.mllib.util
Class LogisticRegressionDataGenerator
Object
org.apache.spark.mllib.util.LogisticRegressionDataGenerator
public class LogisticRegressionDataGenerator
extends Object
:: DeveloperApi ::
Generate test data for LogisticRegression. This class chooses positive labels with probability probOne and scales features for positive examples by eps.
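The behavior described above can be illustrated with a small local sketch that does not depend on Spark. It is not the library's implementation: the class name LogisticDataSketch, the Example holder, the generate method, and the seed parameter are all hypothetical. It also assumes that features are drawn from a standard normal distribution and that "scales" means multiplying each feature by eps; the description above states neither.

```java
import java.util.Random;

public class LogisticDataSketch {
    /** One generated example: a 0/1 label plus its feature values (hypothetical holder type). */
    static final class Example {
        final double label;
        final double[] features;

        Example(double label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    /**
     * Hypothetical local generator following the class description:
     * label 1 with probability probOne, features for positive examples scaled by eps.
     * Assumption: features are standard-normal draws, and scaling is a multiplication.
     */
    static Example[] generate(int nexamples, int nfeatures, double eps,
                              double probOne, long seed) {
        Random rnd = new Random(seed);
        Example[] out = new Example[nexamples];
        for (int i = 0; i < nexamples; i++) {
            // nextDouble() is uniform on [0, 1), so this is true with probability probOne.
            double y = rnd.nextDouble() < probOne ? 1.0 : 0.0;
            double[] x = new double[nfeatures];
            for (int j = 0; j < nfeatures; j++) {
                double g = rnd.nextGaussian();
                // Only positive examples are scaled by eps.
                x[j] = (y == 1.0) ? g * eps : g;
            }
            out[i] = new Example(y, x);
        }
        return out;
    }

    public static void main(String[] args) {
        Example[] data = generate(1000, 5, 3.0, 0.5, 42L);
        int positives = 0;
        for (Example e : data) {
            if (e.label == 1.0) {
                positives++;
            }
        }
        System.out.println("examples: " + data.length + ", positives: " + positives);
    }
}
```

With probOne set to 0.5, roughly half of the generated labels should be 1. Larger eps values spread the positive examples further from the origin than the negative ones under the assumptions above.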
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LogisticRegressionDataGenerator
public LogisticRegressionDataGenerator()
generateLogisticRDD
public static RDD<LabeledPoint> generateLogisticRDD(SparkContext sc,
int nexamples,
int nfeatures,
double eps,
int nparts,
double probOne)
- Generate an RDD containing test data for LogisticRegression.
- Parameters:
sc - SparkContext to use for creating the RDD.
nexamples - Number of examples that will be contained in the RDD.
nfeatures - Number of features to generate for each example.
eps - Epsilon factor by which positive examples are scaled.
nparts - Number of partitions of the generated RDD. Default value is 2.
probOne - Probability that a label is 1 (and not 0). Default value is 0.5.
- Returns:
- An RDD of LabeledPoint containing the generated examples.
main
public static void main(String[] args)