public class StreamingTest
extends Object
implements org.apache.spark.internal.Logging, scala.Serializable
To address novelty affects, the peacePeriod
specifies a set number of initial
RDD
batches of the DStream
to be dropped from significance testing.
The windowSize
sets the number of batches each significance test is to be performed over. The
window is sliding with a stride length of 1 batch. Setting windowSize to 0 will perform
cumulative processing, using all batches seen so far.
Different tests may be used for assessing statistical significance depending on assumptions
satisfied by data. For more details, see StreamingTestMethod
. The testMethod
specifies
which test will be used.
Use a builder pattern to construct a streaming test in an application, for example:
val model = new StreamingTest()
.setPeacePeriod(10)
.setWindowSize(0)
.setTestMethod("welch")
.registerStream(DStream)
Constructor and Description |
---|
StreamingTest() |
Modifier and Type | Method and Description |
---|---|
DStream<org.apache.spark.mllib.stat.test.StreamingTestResult> |
registerStream(DStream<BinarySample> data)
Register a
DStream of values for significance testing. |
JavaDStream<org.apache.spark.mllib.stat.test.StreamingTestResult> |
registerStream(JavaDStream<BinarySample> data)
Register a
JavaDStream of values for significance testing. |
StreamingTest |
setPeacePeriod(int peacePeriod)
Set the number of initial batches to ignore.
|
StreamingTest |
setTestMethod(String method)
Set the statistical method used for significance testing.
|
StreamingTest |
setWindowSize(int windowSize)
Set the number of batches to compute significance tests over.
|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
$init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public DStream<org.apache.spark.mllib.stat.test.StreamingTestResult> registerStream(DStream<BinarySample> data)
DStream
of values for significance testing.
data
- stream of BinarySample(key,value) pairs where the key denotes group membership
(true = experiment, false = control) and the value is the numerical metric to
test for significancepublic JavaDStream<org.apache.spark.mllib.stat.test.StreamingTestResult> registerStream(JavaDStream<BinarySample> data)
JavaDStream
of values for significance testing.
data
- stream of BinarySample(isExperiment,value) pairs where the isExperiment denotes
group (true = experiment, false = control) and the value is the numerical metric
to test for significancepublic StreamingTest setPeacePeriod(int peacePeriod)
public StreamingTest setTestMethod(String method)
public StreamingTest setWindowSize(int windowSize)
windowSize
- (undocumented)