Object

org.apache.spark.mllib.util

LinearDataGenerator

Related Doc: package util

object LinearDataGenerator

:: DeveloperApi :: Generates sample data for linear models. Every feature is drawn uniformly at random, and Gaussian noise scaled by eps is added to the response variable Y.
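The procedure described above can be sketched in plain Scala without a Spark dependency. This is a minimal illustration, not the library's source: `Point`, `LinearDataSketch`, and `generate` are hypothetical names, the feature range [-1, 1] is taken from the note on the overload without xMean and xVariance, and eps is assumed to scale zero-mean standard Gaussian noise, in line with its "Epsilon scaling factor" parameter description.

```scala
import scala.util.Random

// Hypothetical stand-in for Spark's LabeledPoint, so the sketch runs without Spark.
final case class Point(label: Double, features: Array[Double])

object LinearDataSketch {
  /** Draws nPoints samples from y = intercept + w . x + eps * N(0, 1). */
  def generate(
      intercept: Double,
      weights: Array[Double],
      nPoints: Int,
      seed: Int,
      eps: Double): Seq[Point] = {
    val rnd = new Random(seed)
    Seq.fill(nPoints) {
      // One uniform value in [-1, 1) per feature: zero mean, variance 1/3.
      val x = Array.fill(weights.length)(2.0 * rnd.nextDouble() - 1.0)
      // Noise-free linear response.
      val clean = intercept + weights.zip(x).map { case (w, xi) => w * xi }.sum
      // Assumption: eps scales a standard Gaussian draw added to the label.
      Point(clean + eps * rnd.nextGaussian(), x)
    }
  }
}
```

Calling `LinearDataSketch.generate(1.0, Array(2.0, -3.0), 100, 42, 0.1)` yields 100 points with two features each; the same seed reproduces the same sequence.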

Annotations
@DeveloperApi() @Since( "0.8.0" )
Source
LinearDataGenerator.scala
Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double, sparsity: Double): Seq[LabeledPoint]

    intercept

    Data intercept

    weights

    Weights to be applied.

    xMean

    The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.

    xVariance

    The variance of the generated features.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    sparsity

    The ratio of zero elements. If it is 0.0, LabeledPoints with DenseVector features are returned.

    returns

    Seq of input.

    Annotations
    @Since( "1.6.0" )
  10. def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double): Seq[LabeledPoint]

    intercept

    Data intercept

    weights

    Weights to be applied.

    xMean

    The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.

    xVariance

    The variance of the generated features.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Seq of input.

    Annotations
    @Since( "0.8.0" )
  11. def generateLinearInput(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double = 0.1): Seq[LabeledPoint]

    For compatibility, data generated without specifying the mean and variance has zero mean and a variance of 1.0/3.0: the original output range is [-1, 1] with a uniform distribution, and the variance of a uniform distribution on [a, b] is (b - a)^2 / 12, which for [-1, 1] is 4/12 = 1.0/3.0.

    intercept

    Data intercept

    weights

    Weights to be applied.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Seq of input.

    Annotations
    @Since( "0.8.0" )
  12. def generateLinearInputAsList(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double): List[LabeledPoint]

    Return a Java List of synthetic data randomly generated according to a multi collinear model.

    intercept

    Data intercept

    weights

    Weights to be applied.

    nPoints

    Number of points in sample.

    seed

    Random seed

    eps

    Epsilon scaling factor.

    returns

    Java List of input.

    Annotations
    @Since( "0.8.0" )
  13. def generateLinearRDD(sc: SparkContext, nexamples: Int, nfeatures: Int, eps: Double, nparts: Int = 2, intercept: Double = 0.0): RDD[LabeledPoint]

    Generate an RDD containing sample data for Linear Regression models - including Ridge, Lasso, and unregularized variants.

    sc

    SparkContext to be used for generating the RDD.

    nexamples

    Number of examples that will be contained in the RDD.

    nfeatures

    Number of features to generate for each example.

    eps

    Epsilon factor by which examples are scaled.

    nparts

    Number of partitions in the RDD. Default value is 2.

    intercept

    Data intercept. Default value is 0.0.

    returns

    RDD of LabeledPoint containing sample data.

    Annotations
    @Since( "0.8.0" )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. def main(args: Array[String]): Unit

    Annotations
    @Since( "0.8.0" )
  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  22. def toString(): String

    Definition Classes
    AnyRef → Any
  23. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
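
The zero mean and variance of 1.0/3.0 quoted for the generateLinearInput overload without xMean and xVariance follow from the standard moments of a uniform distribution. A short derivation, where X denotes a single feature value (notation introduced here for illustration only):

```latex
\begin{align*}
X \sim \mathrm{Uniform}(a, b): \quad
  & \mathbb{E}[X] = \frac{a + b}{2}, \qquad
    \operatorname{Var}(X) = \frac{(b - a)^2}{12} \\
a = -1,\ b = 1: \quad
  & \mathbb{E}[X] = \frac{-1 + 1}{2} = 0, \qquad
    \operatorname{Var}(X) = \frac{\bigl(1 - (-1)\bigr)^2}{12} = \frac{4}{12} = \frac{1}{3}
\end{align*}
```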
