LinearDataGenerator
-
class pyspark.mllib.util.LinearDataGenerator
Utils for generating linear data.
New in version 1.5.0.
Methods
generateLinearInput(intercept, weights, …): New in version 1.5.0.
generateLinearRDD(sc, nexamples, nfeatures, eps): Generate an RDD of LabeledPoints.
Methods Documentation
-
static generateLinearInput(intercept: float, weights: VectorLike, xMean: VectorLike, xVariance: VectorLike, nPoints: int, seed: int, eps: float) → List[LabeledPoint]
New in version 1.5.0.
- Parameters
- intercept : float
Bias term, the c in X’w + c.
- weights : pyspark.mllib.linalg.Vector or convertible
Feature weight vector, the w in X’w + c.
- xMean : pyspark.mllib.linalg.Vector or convertible
Point around which the data X is centered.
- xVariance : pyspark.mllib.linalg.Vector or convertible
Variance of the data X.
- nPoints : int
Number of points to generate.
- seed : int
Random seed.
- eps : float
Scales the noise: the larger eps is, the more Gaussian noise is added.
- Returns
- list of pyspark.mllib.regression.LabeledPoint, of length nPoints
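The parameters above describe a noisy linear model: each label is X’w + c plus noise scaled by eps. The following is a minimal pure-Python sketch of that model, not Spark's implementation. The function name `sketch_linear_input`, the plain `(label, features)` tuples used in place of LabeledPoint, and the choice to draw features from a normal distribution are all assumptions made for illustration; the documentation only states that X is centered on xMean with variance xVariance.

```python
import random
from typing import List, Sequence, Tuple


def sketch_linear_input(
    intercept: float,
    weights: Sequence[float],
    x_mean: Sequence[float],
    x_variance: Sequence[float],
    n_points: int,
    seed: int,
    eps: float,
) -> List[Tuple[float, List[float]]]:
    """Hypothetical stand-in: returns (label, features) pairs where
    label = x . weights + intercept + eps * gaussian_noise."""
    rng = random.Random(seed)  # seeded so repeated calls give the same data
    points = []
    for _ in range(n_points):
        # Assumption: each feature is normal with the given mean and variance.
        x = [rng.gauss(m, v ** 0.5) for m, v in zip(x_mean, x_variance)]
        noise = rng.gauss(0.0, 1.0)
        label = sum(xi * wi for xi, wi in zip(x, weights)) + intercept + eps * noise
        points.append((label, x))
    return points


# Five noisy points with two features each.
points = sketch_linear_input(0.0, [0.1, 0.2], [0.0, 0.0], [1.0, 1.0], 5, 42, 0.1)

# With eps = 0.0 the noise term vanishes and labels lie exactly on the plane.
exact = sketch_linear_input(1.0, [2.0, -1.0], [0.0, 0.0], [1.0, 1.0], 3, 7, 0.0)
```

Setting eps to 0.0, as in the second call, is a quick way to check that the weights and intercept are wired up as intended before adding noise.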
-
static