org.apache.spark.mllib.util

LinearDataGenerator

object LinearDataGenerator

:: DeveloperApi :: Generates sample data for linear models. This object generates uniformly random values for every feature and adds Gaussian noise, scaled by eps, to the response variable Y.
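
As a rough illustration of how the generator might be called, the sketch below uses the five-argument generateLinearInput overload listed under Value Members. The object name LinearInputExample, the helper sample, and the chosen weights, seed, and point count are illustrative assumptions, and spark-mllib is assumed to be on the classpath. No SparkContext is needed for this overload because it returns a local Seq.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.LinearDataGenerator

// Hypothetical example object; not part of Spark.
object LinearInputExample {

  // Illustrative values only: three features, 100 points, seed 42.
  def sample(): Seq[LabeledPoint] = {
    val intercept = 1.0
    val weights = Array(2.0, -1.5, 0.5)
    val nPoints = 100
    val seed = 42
    val eps = 0.1
    // Positional arguments select the (intercept, weights, nPoints, seed, eps) overload.
    LinearDataGenerator.generateLinearInput(intercept, weights, nPoints, seed, eps)
  }

  def main(args: Array[String]): Unit = {
    val points = sample()
    // Print a few generated points: each has a label and a feature vector.
    points.take(3).foreach { p =>
      println(s"label=${p.label} features=${p.features}")
    }
  }
}
```

Reusing the same seed should reproduce the same sample, which makes this overload convenient for unit tests of regression code.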

Annotations
@DeveloperApi() @Since( "0.8.0" )
Source
LinearDataGenerator.scala
Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double, sparsity: Double): Seq[LabeledPoint]

    intercept

    Data intercept

    weights

    Weights to be applied.

    xMean

    The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.

    xVariance

    the variance of the generated features.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    sparsity

    The ratio of zero elements. If it is 0.0, LabeledPoints with DenseVector features are returned.

    returns

    Seq of input.

    Annotations
    @Since( "1.6.0" )
  12. def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double): Seq[LabeledPoint]

    intercept

    Data intercept

    weights

    Weights to be applied.

    xMean

    The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.

    xVariance

    the variance of the generated features.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Seq of input.

    Annotations
    @Since( "0.8.0" )
  13. def generateLinearInput(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double = 0.1): Seq[LabeledPoint]

    For compatibility, data generated without specifying the mean and variance has zero mean and a variance of 1.0/3.0. The original output range is [-1, 1] with a uniform distribution, and the variance of a uniform distribution on [a, b] is (b - a)^2 / 12, which for a = -1 and b = 1 gives 4/12 = 1.0/3.0.

    intercept

    Data intercept

    weights

    Weights to be applied.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Seq of input.

    Annotations
    @Since( "0.8.0" )
  14. def generateLinearInputAsList(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double): List[LabeledPoint]

    Return a Java List of synthetic data randomly generated according to a multi collinear model.

    intercept

    Data intercept

    weights

    Weights to be applied.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Java List of input.

    Annotations
    @Since( "0.8.0" )
  15. def generateLinearRDD(sc: SparkContext, nexamples: Int, nfeatures: Int, eps: Double, nparts: Int = 2, intercept: Double = 0.0): RDD[LabeledPoint]

    Generate an RDD containing sample data for Linear Regression models, including Ridge, Lasso, and unregularized variants.

    sc

    SparkContext to be used for generating the RDD.

    nexamples

    Number of examples that will be contained in the RDD.

    nfeatures

    Number of features to generate for each example.

    eps

    Epsilon factor by which examples are scaled.

    nparts

    Number of partitions in the RDD. Default value is 2.

    intercept

    Data intercept. Default value is 0.0.

    returns

    RDD of LabeledPoint containing sample data.

    Annotations
    @Since( "0.8.0" )
  16. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  17. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. def main(args: Array[String]): Unit

    Annotations
    @Since( "0.8.0" )
  20. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  21. final def notify(): Unit

    Definition Classes
    AnyRef
  22. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. def toString(): String

    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
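
For the distributed variant, a minimal sketch of calling generateLinearRDD might look like the following. The object name LinearRDDExample, the helper generate, the local[2] master, and the argument values are assumptions chosen for illustration; a real job would normally receive its SparkContext from the surrounding application.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.LinearDataGenerator
import org.apache.spark.rdd.RDD

// Hypothetical example object; not part of Spark.
object LinearRDDExample {

  // Illustrative values: 1000 examples, 10 features, 4 partitions.
  def generate(sc: SparkContext): RDD[LabeledPoint] =
    LinearDataGenerator.generateLinearRDD(
      sc, nexamples = 1000, nfeatures = 10, eps = 0.1, nparts = 4)

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("LinearRDDExample").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = generate(sc)
      println(s"count=${data.count()} partitions=${data.getNumPartitions}")
    } finally {
      sc.stop() // always release the context, even if generation fails
    }
  }
}
```

The intercept argument is left at its default of 0.0 here; pass it explicitly to shift the generated labels.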
