Class EMLDAOptimizer
Object
org.apache.spark.mllib.clustering.EMLDAOptimizer
- All Implemented Interfaces:
LDAOptimizer
Optimizer for the EM algorithm. It stores the data and parameter graph, along with the algorithm parameters.
Currently, the underlying implementation uses Expectation-Maximization (EM), implemented according to the Asuncion et al. (2009) paper referenced below.
References:
- Original LDA paper (journal version): Blei, Ng, and Jordan. "Latent Dirichlet Allocation." JMLR, 2003.
  - This class implements their "smoothed" LDA model.
- Paper which clearly explains several algorithms, including EM: Asuncion, Welling, Smyth, and Teh. "On Smoothing and Inference for Topic Models." UAI, 2009.
-
Constructor Summary
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| boolean | getKeepLastCheckpoint() | If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up). |
| EMLDAOptimizer | setKeepLastCheckpoint(boolean keepLastCheckpoint) | If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up). |
-
Constructor Details
-
EMLDAOptimizer
public EMLDAOptimizer()
-
-
Method Details
-
getKeepLastCheckpoint
public boolean getKeepLastCheckpoint()
If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up).
- Returns:
- (undocumented)
-
setKeepLastCheckpoint
public EMLDAOptimizer setKeepLastCheckpoint(boolean keepLastCheckpoint)
If using checkpointing, this indicates whether to keep the last checkpoint (vs clean up). Deleting the checkpoint can cause failures if a data partition is lost, so set this bit with care.
Default: true
- Parameters:
keepLastCheckpoint - (undocumented)
- Returns:
- (undocumented)
- Note:
- Checkpoints will be cleaned up via reference counting, regardless.
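The following is a minimal sketch of how the setter might be used when configuring an `LDA` instance from Java. The class name `EmLdaCheckpointExample`, the helper `configureLda`, and the values chosen for the topic count, iteration count, and checkpoint interval are illustrative assumptions, not part of this API. It assumes `spark-mllib` is on the classpath. Checkpointing itself only takes effect once a checkpoint directory has been set on the `SparkContext`, which this sketch does not do.

```java
import org.apache.spark.mllib.clustering.EMLDAOptimizer;
import org.apache.spark.mllib.clustering.LDA;

// Hypothetical example class; the name and the numeric values are illustrative only.
class EmLdaCheckpointExample {

    // Builds an LDA configuration that uses the EM optimizer and
    // discards the last checkpoint once training finishes.
    static LDA configureLda() {
        // keepLastCheckpoint defaults to true; false requests cleanup.
        // Per the documentation above, this can cause failures if a
        // data partition is lost, so it is set here only for illustration.
        EMLDAOptimizer optimizer = new EMLDAOptimizer()
            .setKeepLastCheckpoint(false);

        return new LDA()
            .setK(10)                   // number of topics (arbitrary)
            .setMaxIterations(50)       // arbitrary
            .setCheckpointInterval(10)  // checkpoint every 10 iterations (arbitrary)
            .setOptimizer(optimizer);
    }

    public static void main(String[] args) {
        LDA lda = configureLda();
        EMLDAOptimizer optimizer = (EMLDAOptimizer) lda.getOptimizer();
        System.out.println("keepLastCheckpoint = " + optimizer.getKeepLastCheckpoint());
    }
}
```

Because `setKeepLastCheckpoint` returns the optimizer itself, the call can be chained directly inside `setOptimizer(...)` if a separate variable is not needed.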
-