Object
  org.apache.spark.mllib.clustering.EMLDAOptimizer
public final class EMLDAOptimizer
:: DeveloperApi ::
Optimizer for the EM algorithm. It stores the data and parameter graph, plus the algorithm parameters.
The underlying implementation currently uses Expectation-Maximization (EM), following the Asuncion et al. (2009) paper referenced below.
References:
- Original LDA paper (journal version): Blei, Ng, and Jordan. "Latent Dirichlet Allocation." JMLR, 2003.
  - This class implements their "smoothed" LDA model.
- Paper which clearly explains several algorithms, including EM: Asuncion, Welling, Smyth, and Teh. "On Smoothing and Inference for Topic Models." UAI, 2009.
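As a rough illustration of how the parameters listed on this page (k, vocabSize, docConcentration, topicConcentration) could enter an EM update, the following plain-Java sketch computes a normalized topic distribution for one (term, document) pair. It assumes the MAP-style update from Asuncion et al. (2009), in which the weight for topic t is proportional to (N_wt + eta - 1)(N_tj + alpha - 1) / (N_t + W*eta - 1). The class `EStepSketch` and the method `topicPosterior` are hypothetical names and are not part of Spark; whether EMLDAOptimizer computes exactly this is an assumption.

```java
/** Hypothetical helper, not part of Spark: a MAP-style E-step for one (term, document) pair. */
final class EStepSketch {
    private EStepSketch() {}

    /**
     * @param termTopicCounts    N_wt: per-topic counts for the term (length k)
     * @param docTopicCounts     N_tj: per-topic counts for the document (length k)
     * @param totalTopicCounts   N_t: per-topic counts summed over all terms (length k)
     * @param vocabSize          W: number of terms in the vocabulary
     * @param topicConcentration eta: Dirichlet parameter on topic-term distributions
     * @param docConcentration   alpha: Dirichlet parameter on document-topic distributions
     * @return topic distribution for this pair, normalized to sum to 1
     */
    static double[] topicPosterior(double[] termTopicCounts, double[] docTopicCounts,
                                   double[] totalTopicCounts, int vocabSize,
                                   double topicConcentration, double docConcentration) {
        int k = totalTopicCounts.length;
        double eta1 = topicConcentration - 1.0;
        double alpha1 = docConcentration - 1.0;
        double wEta1 = vocabSize * eta1;

        double[] gamma = new double[k];
        double sum = 0.0;
        for (int t = 0; t < k; t++) {
            // Unnormalized weight of topic t for this (term, document) pair.
            gamma[t] = (termTopicCounts[t] + eta1) * (docTopicCounts[t] + alpha1)
                    / (totalTopicCounts[t] + wEta1);
            sum += gamma[t];
        }
        // Normalize so the weights form a probability distribution over topics.
        for (int t = 0; t < k; t++) {
            gamma[t] /= sum;
        }
        return gamma;
    }
}
```

Under this formula both concentration parameters should be greater than 1.0 so that every weight stays positive.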
Constructor Summary | |
---|---|
EMLDAOptimizer() | |
Method Summary | |
---|---|
int | checkpointInterval() |
double | docConcentration() |
breeze.linalg.DenseVector<Object> | globalTopicTotals() Aggregate distributions over topics from all term vertices. |
Graph<breeze.linalg.DenseVector<Object>,Object> | graph() The following fields will only be initialized through the initialize() method |
int | k() |
double | topicConcentration() |
int | vocabSize() |
Methods inherited from class Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public EMLDAOptimizer()
Method Detail |
---|
public Graph<breeze.linalg.DenseVector<Object>,Object> graph()
public int k()
public int vocabSize()
public double docConcentration()
public double topicConcentration()
public int checkpointInterval()
public breeze.linalg.DenseVector<Object> globalTopicTotals()
Note: This executes an action on the graph RDDs.
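The aggregation that globalTopicTotals() describes, summing the per-topic vectors held by all term vertices, can be sketched in plain Java without Spark. The class `TopicTotalsSketch` and its `sum` method are hypothetical and only mirror the arithmetic; the real method runs over distributed graph RDDs, which is why calling it triggers an action.

```java
import java.util.List;

/** Hypothetical helper, not part of Spark: element-wise sum of per-term topic vectors. */
final class TopicTotalsSketch {
    private TopicTotalsSketch() {}

    /**
     * @param termVertexCounts one length-k vector of topic weights per term vertex
     * @param k                number of topics
     * @return length-k vector whose entry t is the sum of entry t over all term vertices
     */
    static double[] sum(List<double[]> termVertexCounts, int k) {
        double[] totals = new double[k];
        for (double[] counts : termVertexCounts) {
            if (counts.length != k) {
                throw new IllegalArgumentException(
                        "expected vector of length " + k + ", got " + counts.length);
            }
            for (int t = 0; t < k; t++) {
                totals[t] += counts[t];
            }
        }
        return totals;
    }
}
```

Because the distributed version scans every term vertex, a caller that needs the totals more than once would likely want to keep the result rather than call the method repeatedly.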