LinearDataGenerator

object LinearDataGenerator

Generates sample data for linear models. Each feature is drawn uniformly at random, and zero-mean Gaussian noise scaled by eps is added to the response variable Y.

Annotations: @Since( "0.8.0" )
Source: LinearDataGenerator.scala
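
The description above can be read as the model Y = weights · X + intercept + eps · N(0, 1). The following is a minimal, Spark-free sketch of that reading, not the library's implementation. The names LinearModelSketch, Example, and generate are hypothetical, and the feature range [-1, 1) is an assumption taken from the compatibility note on the five-argument generateLinearInput overload.

```scala
import scala.util.Random

object LinearModelSketch {
  // Hypothetical container for one synthetic example (not a Spark type).
  final case class Example(label: Double, features: Array[Double])

  // Draws nPoints examples with features uniform on [-1, 1) and
  // label = weights . features + intercept + eps * N(0, 1).
  def generate(
      intercept: Double,
      weights: Array[Double],
      nPoints: Int,
      seed: Int,
      eps: Double): Seq[Example] = {
    val rnd = new Random(seed)
    Seq.fill(nPoints) {
      val x = Array.fill(weights.length)(2.0 * rnd.nextDouble() - 1.0)
      val signal = weights.zip(x).map { case (w, xi) => w * xi }.sum
      Example(signal + intercept + eps * rnd.nextGaussian(), x)
    }
  }
}
```

With eps set to 0.0 the labels lie exactly on the hyperplane defined by weights and intercept, which gives a quick sanity check when testing a regression algorithm.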

Linear Supertypes

AnyRef, Any

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( ... ) @native() @IntrinsicCandidate()
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double, sparsity: Double): Seq[LabeledPoint]
intercept
Data intercept
weights
Weights to be applied.
xMean
The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.
xVariance
the variance of the generated features.
nPoints
Number of points in sample.
seed
Random seed
eps
Epsilon scaling factor.
sparsity
The ratio of zero elements. If it is 0.0, LabeledPoints with DenseVector features are returned.
returns
Seq of input.

Annotations
@Since( "1.6.0" )
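
As a usage sketch for the overload above, the snippet below calls it once with sparsity 0.0 and once with a non-zero sparsity. It assumes spark-mllib is on the classpath. The object name SparsityExample and all numeric values are illustrative only.

```scala
import org.apache.spark.mllib.linalg.DenseVector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.LinearDataGenerator

object SparsityExample {
  // Illustrative values, not recommendations.
  val weights: Array[Double] = Array(0.5, -1.2, 2.0)
  val xMean: Array[Double] = Array(0.0, 1.0, -1.0)
  val xVariance: Array[Double] = Array(1.0, 0.5, 2.0)

  // sparsity = 0.0: the documentation states that dense feature vectors are returned.
  val dense: Seq[LabeledPoint] = LinearDataGenerator.generateLinearInput(
    3.0, weights, xMean, xVariance, 100, 42, 0.1, 0.0)

  // sparsity = 0.7: a non-zero sparsity value (the documented "ratio of zero elements").
  val sparse: Seq[LabeledPoint] = LinearDataGenerator.generateLinearInput(
    3.0, weights, xMean, xVariance, 100, 42, 0.1, 0.7)
}
```

No SparkContext is needed here, since this overload returns a local Seq rather than an RDD.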
def generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double): Seq[LabeledPoint]
intercept
Data intercept
weights
Weights to be applied.
xMean
The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.
xVariance
the variance of the generated features.
nPoints
Number of points in sample.
seed
Random seed
eps
Epsilon scaling factor.
returns
Seq of input.

Annotations
@Since( "0.8.0" )
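
The xMean and xVariance parameters describe uniformly distributed features with a chosen mean and variance. One way to obtain such a draw is sketched below. It is an illustration under that assumption, not necessarily what the library does internally, and the names UniformFeatureSketch, draw, and sampleMeanAndVariance are hypothetical.

```scala
import scala.util.Random

object UniformFeatureSketch {
  // A uniform draw on [0, 1) has variance 1/12, so scaling the centred draw by
  // sqrt(12 * variance) yields the requested variance; adding mean shifts it.
  def draw(rnd: Random, mean: Double, variance: Double): Double =
    (rnd.nextDouble() - 0.5) * math.sqrt(12.0 * variance) + mean

  // Empirical mean and (population) variance, used to check the draws.
  def sampleMeanAndVariance(xs: Seq[Double]): (Double, Double) = {
    val m = xs.sum / xs.length
    val v = xs.map(x => (x - m) * (x - m)).sum / xs.length
    (m, v)
  }
}
```

For a large sample, sampleMeanAndVariance should return values close to the requested mean and variance, which is a simple way to confirm the scaling.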
def generateLinearInput(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double = 0.1): Seq[LabeledPoint]
For compatibility, data generated without an explicit mean and variance has zero mean and a variance of 1.0/3.0: the original output range is [-1, 1] with a uniform distribution, and the variance of a uniform distribution on [a, b] is (b - a)² / 12, which gives 1.0/3.0 here.
intercept
Data intercept
weights
Weights to be applied.
nPoints
Number of points in sample.
seed
Random seed
eps
Epsilon scaling factor.
returns
Seq of input.

Annotations
@Since( "0.8.0" )
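
The default mean and variance quoted for this overload follow from the standard moments of a uniform distribution on [a, b], evaluated at a = -1 and b = 1:

```latex
\mathbb{E}[X] = \frac{a + b}{2} = \frac{-1 + 1}{2} = 0,
\qquad
\operatorname{Var}(X) = \frac{(b - a)^2}{12} = \frac{\bigl(1 - (-1)\bigr)^2}{12} = \frac{4}{12} = \frac{1}{3}
```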
def generateLinearInputAsList(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double): List[LabeledPoint]
Returns a Java List of synthetic data randomly generated according to a multi-collinear model.
intercept
Data intercept
weights
Weights to be applied.
nPoints
Number of points in sample.
seed
Random seed
eps
Epsilon scaling factor.
returns
Java List of input.

Annotations
@Since( "0.8.0" )
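
This method exists so that Java callers receive a java.util.List rather than a Scala Seq. The sketch below shows how such a conversion can be written in Scala 2.13 or later, where scala.jdk.CollectionConverters is available (on Scala 2.12 the equivalent import is scala.collection.JavaConverters). The names AsJavaListSketch and toJavaList are hypothetical, and this is not claimed to be how the library performs the conversion.

```scala
import java.util.{List => JList}
import scala.jdk.CollectionConverters._

object AsJavaListSketch {
  // Wraps a Scala Seq as a java.util.List view without copying the elements.
  def toJavaList[A](xs: Seq[A]): JList[A] = xs.asJava
}
```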
def generateLinearRDD(sc: SparkContext, nexamples: Int, nfeatures: Int, eps: Double, nparts: Int = 2, intercept: Double = 0.0): RDD[LabeledPoint]
Generates an RDD containing sample data for linear regression models, including Ridge, Lasso, and unregularized variants.
sc
SparkContext to be used for generating the RDD.
nexamples
Number of examples that will be contained in the RDD.
nfeatures
Number of features to generate for each example.
eps
Epsilon factor by which examples are scaled.
nparts
Number of partitions in the RDD. Default value is 2.
intercept
Data intercept. Default value is 0.0.
returns
RDD of LabeledPoint containing sample data.

Annotations
@Since( "0.8.0" )
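
A usage sketch for generateLinearRDD follows. It assumes spark-core and spark-mllib are on the classpath and that a local master is acceptable. The object name GenerateLinearRDDExample, the application name, and all numeric values are illustrative. nexamples is chosen as a multiple of nparts so the examples divide evenly across partitions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.util.LinearDataGenerator

object GenerateLinearRDDExample {
  // Returns (example count, partition count, features per example).
  def run(): (Long, Int, Int) = {
    val conf = new SparkConf()
      .setAppName("LinearDataGeneratorExample") // illustrative name
      .setMaster("local[2]")                    // assumption: local mode
    val sc = new SparkContext(conf)
    try {
      val data = LinearDataGenerator.generateLinearRDD(
        sc, nexamples = 1000, nfeatures = 10, eps = 0.1, nparts = 4, intercept = 2.0)
      (data.count(), data.getNumPartitions, data.first().features.size)
    } finally {
      sc.stop() // always release the local Spark context
    }
  }
}
```

The resulting RDD[LabeledPoint] can be passed to a regression trainer for testing, while the tuple returned by run() only summarizes its shape.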
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native() @IntrinsicCandidate()
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native() @IntrinsicCandidate()
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def main(args: Array[String]): Unit

Annotations
@Since( "0.8.0" )
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native() @IntrinsicCandidate()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native() @IntrinsicCandidate()
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... ) @native()
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Deprecated Value Members

def finalize(): Unit

Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] ) @Deprecated
Deprecated
