LinearDataGenerator
- class pyspark.mllib.util.LinearDataGenerator
- Utils for generating linear data.
- New in version 1.5.0.
- Methods
- generateLinearInput(intercept, weights, ...): New in version 1.5.0.
- generateLinearRDD(sc, nexamples, nfeatures, eps): Generate an RDD of LabeledPoints.
- Methods Documentation
- static generateLinearInput(intercept, weights, xMean, xVariance, nPoints, seed, eps)
- New in version 1.5.0.
- Parameters
- intercept : float
- Bias factor, the term c in X’w + c.
- weights : pyspark.mllib.linalg.Vector or convertible
- Weight vector, the term w in X’w + c.
- xMean : pyspark.mllib.linalg.Vector or convertible
- Point around which the data X is centered.
- xVariance : pyspark.mllib.linalg.Vector or convertible
- Variance of the data X.
- nPoints : int
- Number of points to be generated.
- seed : int
- Random seed.
- eps : float
- Scales the noise: the larger eps is, the more Gaussian noise is added.
 
- Returns
- list of pyspark.mllib.regression.LabeledPoint objects, of length nPoints
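
The parameters above describe the model label = X’w + c plus Gaussian noise scaled by eps. The following is a minimal pure-Python sketch of that model, not the Spark implementation. The function name `sketch_linear_input`, the `(label, features)` tuple output, and the choice of a rescaled uniform distribution for the features are assumptions made for illustration.

```python
import math
import random


def sketch_linear_input(intercept, weights, x_mean, x_variance, n_points, seed, eps):
    """Hypothetical stand-in: returns (label, features) tuples, not LabeledPoints."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # Assumption: each feature is uniform, rescaled to the requested mean
        # and variance. A uniform on [-0.5, 0.5] has variance 1/12, so scaling
        # by sqrt(12 * var) yields variance var.
        features = [
            mean + (rng.random() - 0.5) * math.sqrt(12.0 * var)
            for mean, var in zip(x_mean, x_variance)
        ]
        # label = X'w + c, plus Gaussian noise scaled by eps
        label = (
            sum(x * w for x, w in zip(features, weights))
            + intercept
            + eps * rng.gauss(0.0, 1.0)
        )
        points.append((label, features))
    return points


# With eps=0.0 no noise is added, so every label equals X'w + c exactly.
sample = sketch_linear_input(
    intercept=0.5,
    weights=[2.0, -1.0],
    x_mean=[0.0, 1.0],
    x_variance=[1.0, 4.0],
    n_points=100,
    seed=42,
    eps=0.0,
)
```

With an active SparkContext, the corresponding library call would take the same arguments in the same order, for example `LinearDataGenerator.generateLinearInput(0.5, [2.0, -1.0], [0.0, 1.0], [1.0, 4.0], 100, 42, 0.0)`, and would return LabeledPoint objects rather than tuples.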