AFTAggregator (Spark 2.2.1 JavaDoc)

Object
- org.apache.spark.ml.regression.AFTAggregator

All Implemented Interfaces:

java.io.Serializable
```
public class AFTAggregator
extends Object
implements scala.Serializable
```
AFTAggregator computes the gradient and loss of the AFT loss function used in AFT survival regression. Samples may be given as sparse or dense vectors and are processed in an online fashion.
The loss function and likelihood function under the AFT model are based on: Lawless, J. F., Statistical Models and Methods for Lifetime Data, New York: John Wiley & Sons, Inc. 2003.
Two AFTAggregators can be merged to obtain the loss and gradient of the corresponding joint dataset.
Given the values of the covariates $x^{'}$, for random lifetimes $t_{i}$ of subjects $i = 1, \ldots, n$, with possible right-censoring, the likelihood function under the AFT model is given as

$$ L(\beta,\sigma)=\prod_{i=1}^n[\frac{1}{\sigma}f_{0} (\frac{\log{t_{i}}-x^{'}\beta}{\sigma})]^{\delta_{i}}S_{0} (\frac{\log{t_{i}}-x^{'}\beta}{\sigma})^{1-\delta_{i}} $$

where $\delta_{i}$ indicates whether the event has occurred, i.e. $\delta_{i}=1$ if observation $i$ is uncensored and $\delta_{i}=0$ if it is censored. Using $\epsilon_{i}=\frac{\log{t_{i}}-x^{'}\beta}{\sigma}$, the log-likelihood function takes the form

$$ \iota(\beta,\sigma)=\sum_{i=1}^{n}[-\delta_{i}\log\sigma+ \delta_{i}\log{f_{0}}(\epsilon_{i})+(1-\delta_{i})\log{S_{0}(\epsilon_{i})}] $$
where $S_{0}(\epsilon_{i})$ is the baseline survivor function and $f_{0}(\epsilon_{i})$ is the corresponding density function.
The most commonly used log-linear survival regression method is based on the Weibull distribution of the survival time. A Weibull distribution for the lifetime corresponds to an extreme value distribution for the log of the lifetime, and the $S_{0}(\epsilon)$ function is

$$ S_{0}(\epsilon_{i})=\exp(-e^{\epsilon_{i}}) $$

and the $f_{0}(\epsilon_{i})$ function is

$$ f_{0}(\epsilon_{i})=e^{\epsilon_{i}}\exp(-e^{\epsilon_{i}}) $$

The log-likelihood function for Weibull distribution of lifetime is

$$ \iota(\beta,\sigma)= -\sum_{i=1}^n[\delta_{i}\log\sigma-\delta_{i}\epsilon_{i}+e^{\epsilon_{i}}] $$
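The step from the general log-likelihood to this Weibull form can be made explicit. Taking logarithms of the two baseline functions gives

$$ \log S_{0}(\epsilon_{i})=-e^{\epsilon_{i}}, \qquad \log f_{0}(\epsilon_{i})=\epsilon_{i}-e^{\epsilon_{i}} $$

so the censoring-dependent part of each summand collapses to

$$ \delta_{i}\log f_{0}(\epsilon_{i})+(1-\delta_{i})\log S_{0}(\epsilon_{i}) =\delta_{i}\epsilon_{i}-\delta_{i}e^{\epsilon_{i}}-e^{\epsilon_{i}}+\delta_{i}e^{\epsilon_{i}} =\delta_{i}\epsilon_{i}-e^{\epsilon_{i}} $$

Adding the remaining $-\delta_{i}\log\sigma$ term and summing over $i$ yields the expression above.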

Because minimizing the negative log-likelihood is equivalent to maximizing the likelihood (and hence the a posteriori probability under a flat prior), the loss function we optimize is $-\iota(\beta,\sigma)$. The gradients with respect to $\beta$ and $\log\sigma$, respectively, are

$$ \frac{\partial (-\iota)}{\partial \beta}= \sum_{i=1}^{n}[\delta_{i}-e^{\epsilon_{i}}]\frac{x_{i}}{\sigma} \\
\frac{\partial (-\iota)}{\partial (\log\sigma)}= \sum_{i=1}^{n}[\delta_{i}+(\delta_{i}-e^{\epsilon_{i}})\epsilon_{i}] $$
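The loss and gradient formulas above can be evaluated directly. The following is a minimal sketch, not Spark's implementation: `WeibullAftSketch` and `lossAndGradient` are hypothetical names, the samples are plain dense `double[]` rows, no intercept or feature standardization is applied, and the result holds raw sums over the samples.

```java
/** Hypothetical helper (not part of Spark): Weibull AFT loss and gradient over dense samples. */
final class WeibullAftSketch {

    /**
     * Returns an array laid out as {loss, dLoss/dLogSigma, dLoss/dBeta[0], ...},
     * where loss is the negative Weibull log-likelihood summed over all samples.
     */
    static double[] lossAndGradient(double[][] x, double[] t, double[] delta,
                                    double[] beta, double logSigma) {
        double sigma = Math.exp(logSigma);
        double[] out = new double[2 + beta.length];
        for (int i = 0; i < t.length; i++) {
            double margin = 0.0; // x_i' * beta
            for (int j = 0; j < beta.length; j++) {
                margin += x[i][j] * beta[j];
            }
            double eps = (Math.log(t[i]) - margin) / sigma;
            double expEps = Math.exp(eps);

            // Loss term: delta_i * log(sigma) - delta_i * eps_i + e^{eps_i}
            out[0] += delta[i] * logSigma - delta[i] * eps + expEps;
            // Gradient w.r.t. log(sigma): delta_i + (delta_i - e^{eps_i}) * eps_i
            out[1] += delta[i] + (delta[i] - expEps) * eps;
            // Gradient w.r.t. beta: (delta_i - e^{eps_i}) * x_i / sigma
            double weight = (delta[i] - expEps) / sigma;
            for (int j = 0; j < beta.length; j++) {
                out[2 + j] += weight * x[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = {{1.0, 2.0}, {0.5, -1.0}};
        double[] t = {2.0, 3.0};
        double[] delta = {1.0, 0.0}; // first sample uncensored, second right-censored
        double[] beta = {0.1, -0.2};
        System.out.println(java.util.Arrays.toString(lossAndGradient(x, t, delta, beta, 0.3)));
    }
}
```

A quick sanity check on such a sketch is to compare each gradient entry against a central finite difference of the loss.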

param: bcParameters The broadcast value consists of three parts: the log of the scale parameter, the intercept, and the regression coefficients corresponding to the features.

param: fitIntercept Whether to fit an intercept term.

param: bcFeaturesStd The broadcast standard deviation values of the features.
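
To illustrate the three-part parameter vector, here is a minimal sketch that unpacks it. It assumes the parts are stored in the order they are listed (log of scale, intercept, coefficients), which this page does not state explicitly. `AftParameterLayout` is a hypothetical name, and a plain `double[]` stands in for the broadcast Breeze vector.

```java
/** Hypothetical view (not part of Spark) over a parameter vector {log(sigma), intercept, coefficients...}. */
final class AftParameterLayout {
    final double sigma;
    final double intercept;
    final double[] coefficients;

    AftParameterLayout(double[] parameters) {
        if (parameters.length < 2) {
            throw new IllegalArgumentException("need at least log(sigma) and the intercept");
        }
        // Assumed order: index 0 holds log(sigma), so sigma itself is always positive.
        this.sigma = Math.exp(parameters[0]);
        // Assumed order: index 1 holds the intercept.
        this.intercept = parameters[1];
        // Everything after the first two entries is treated as one coefficient per feature.
        this.coefficients = java.util.Arrays.copyOfRange(parameters, 2, parameters.length);
    }

    public static void main(String[] args) {
        AftParameterLayout p = new AftParameterLayout(new double[]{0.0, 1.5, 2.0, -3.0});
        System.out.println(p.sigma + " " + p.intercept + " "
                + java.util.Arrays.toString(p.coefficients));
    }
}
```

Optimizing over $\log\sigma$ rather than $\sigma$ keeps the scale positive without a constraint, which matches the gradient being stated with respect to $\log\sigma$.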

See Also:

Serialized Form

Constructor Summary

Constructors
Constructor and Description
`AFTAggregator(Broadcast<breeze.linalg.DenseVector<Object>> bcParameters, boolean fitIntercept, Broadcast<double[]> bcFeaturesStd)`

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `AFTAggregator` | `add(org.apache.spark.ml.regression.AFTPoint data)` Add a new training data point to this AFTAggregator and update the loss and gradient of the objective function. |
| `long` | `count()` |
| `breeze.linalg.DenseVector<Object>` | `gradient()` |
| `double` | `loss()` |
| `AFTAggregator` | `merge(AFTAggregator other)` Merge another AFTAggregator and update the loss and gradient of the objective function. |

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - AFTAggregator
```
public AFTAggregator(Broadcast<breeze.linalg.DenseVector<Object>> bcParameters,
                     boolean fitIntercept,
                     Broadcast<double[]> bcFeaturesStd)
```
- Method Detail
  - count
```
public long count()
```
  - loss
```
public double loss()
```
  - gradient
```
public breeze.linalg.DenseVector<Object> gradient()
```
  - add
```
public AFTAggregator add(org.apache.spark.ml.regression.AFTPoint data)
```
    Add a new training data point to this AFTAggregator and update the loss and gradient of the objective function.
    
    Parameters:
    
    data - The AFTPoint representation for one data point to be added into this aggregator.
    
    Returns:
    
    This AFTAggregator object.
  - merge
```
public AFTAggregator merge(AFTAggregator other)
```
    Merge another AFTAggregator and update the loss and gradient of the objective function. (Note that the merge is performed in place, so this object is modified.)
    
    Parameters:
    
    other - The other AFTAggregator to be merged.
    
    Returns:
    
    This AFTAggregator object.
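
The `add`/`merge` contract described above (both mutate and return this object) can be sketched with a toy aggregator. `MergeableAggregatorSketch` is a hypothetical class, not Spark's code: it takes per-sample loss and gradient values that were computed elsewhere and returns raw sums. Whether the real `loss()` and `gradient()` scale by the count is not stated on this page.

```java
/** Hypothetical toy aggregator (not Spark's AFTAggregator) with in-place add and merge. */
final class MergeableAggregatorSketch {
    private long count = 0L;
    private double lossSum = 0.0;
    private final double[] gradientSum;

    MergeableAggregatorSketch(int dim) {
        this.gradientSum = new double[dim];
    }

    /** Adds one sample's precomputed loss and gradient; returns this for chaining. */
    MergeableAggregatorSketch add(double sampleLoss, double[] sampleGradient) {
        if (sampleGradient.length != gradientSum.length) {
            throw new IllegalArgumentException("gradient dimension mismatch");
        }
        count += 1;
        lossSum += sampleLoss;
        for (int j = 0; j < gradientSum.length; j++) {
            gradientSum[j] += sampleGradient[j];
        }
        return this;
    }

    /** Folds another aggregator into this one in place; the other is left unchanged. */
    MergeableAggregatorSketch merge(MergeableAggregatorSketch other) {
        if (other.gradientSum.length != gradientSum.length) {
            throw new IllegalArgumentException("gradient dimension mismatch");
        }
        count += other.count;
        lossSum += other.lossSum;
        for (int j = 0; j < gradientSum.length; j++) {
            gradientSum[j] += other.gradientSum[j];
        }
        return this;
    }

    long count() { return count; }
    double loss() { return lossSum; }
    double[] gradient() { return gradientSum.clone(); }

    public static void main(String[] args) {
        // Two partial aggregators, e.g. one per data partition.
        MergeableAggregatorSketch a = new MergeableAggregatorSketch(2).add(1.0, new double[]{1.0, 2.0});
        MergeableAggregatorSketch b = new MergeableAggregatorSketch(2)
                .add(2.0, new double[]{3.0, 4.0})
                .add(0.5, new double[]{-1.0, 0.0});
        a.merge(b); // a now summarizes all three samples
        System.out.println(a.count() + " " + a.loss() + " " + java.util.Arrays.toString(a.gradient()));
    }
}
```

Because the summaries are plain sums, merging partial aggregators in any order gives the same totals as adding every sample to a single aggregator, up to floating-point rounding.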
