Efficient Training of LDA on a GPU by Mean-for-Mode Estimation
Authors: Jean-Baptiste Tristan, Joseph Tassarotti, Guy Steele
ICML 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have run a series of experiments which show that in practice, Mean-for-Mode estimation converges in fewer samples than standard uncollapsed Gibbs sampling. In these experiments, we observed how the log-likelihood of LDA evolves with the number of samples. Figure 3 presents the results of one of our experiments, run on a subset of Wikipedia... We present the resulting benchmarks in Figures 4 and 5 to show how the gap between the GPU algorithms' runtimes and that of a collapsed Gibbs sampler scales. |
| Researcher Affiliation | Collaboration | Jean-Baptiste Tristan (JEAN.BAPTISTE.TRISTAN@ORACLE.COM), Oracle Labs, USA; Joseph Tassarotti (JTASSARO@CS.CMU.EDU), Department of Computer Science, Carnegie Mellon University, USA; Guy L. Steele Jr. (GUY.STEELE@ORACLE.COM), Oracle Labs, USA |
| Pseudocode | Yes | Algorithm 1: Drawing the latent variables; Algorithm 2: Estimation of the φ variables; Algorithm 3: LDA sampler using both sparse and dense matrices (a sketch of the first two steps appears after the table) |
| Open Source Code | No | The paper does not include an explicit statement about open-sourcing the code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | on a subset of Wikipedia (50,000 documents, 3,000,000 tokens, 40,000 vocabulary words) |
| Dataset Splits | No | The paper mentions using a Wikipedia subset of specific size and varying number of documents/topics for experiments, but it does not provide specific training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | We implemented this algorithm, as well as the uncollapsed Gibbs sampler, on an NVIDIA Titan Black. We also implemented a collapsed Gibbs sampler for comparison on an Intel i7-4820K CPU. |
| Software Dependencies | No | The paper discusses the implementation on GPUs and CPUs but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | using 20 topics, and both α and β equal to 0.1. ... The experiment was run 10 times with a varying seed for the random number generator. ... The number of initial iterations that are done using dense probability matrices corresponds to the parameter D. (A hypothetical configuration sketch follows the table.) |
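
To illustrate the first two algorithms listed in the Pseudocode row, here is a minimal, sequential NumPy sketch of one sampler iteration: draw each token's topic given θ and φ (Algorithm 1), then replace the draw of θ and φ from their Dirichlet posteriors with the posterior mean, i.e., the Mean-for-Mode estimate (Algorithm 2). All function and variable names here are our own, and the paper's actual implementation runs these loops as data-parallel GPU kernels; this sketch only conveys the estimator.

```python
import numpy as np

def mean_for_mode_step(docs, K, V, alpha, beta, theta, phi, rng):
    """One iteration: Algorithm 1 (draw latent topics), then Algorithm 2
    (Mean-for-Mode estimation of the Dirichlet-distributed parameters).

    docs: list of integer arrays, one per document (word ids in [0, V)).
    theta: (D, K) document-topic proportions; phi: (K, V) topic-word dists.
    """
    D = len(docs)
    n_dk = np.zeros((D, K))  # per-document topic counts
    n_kw = np.zeros((K, V))  # per-topic word counts

    # Algorithm 1: draw each token's topic from p(z=k | w) ∝ theta[d,k] * phi[k,w].
    for d, words in enumerate(docs):
        for w in words:
            p = theta[d] * phi[:, w]
            z = rng.choice(K, p=p / p.sum())
            n_dk[d, z] += 1
            n_kw[z, w] += 1

    # Algorithm 2: instead of sampling theta/phi from their Dirichlet
    # posteriors (or computing the mode), plug in the posterior *mean*:
    # cheap to compute, always well-defined, and close to the mode
    # when counts are large.
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + K * alpha)
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + V * beta)
    return theta, phi
```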
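
And a hypothetical driver matching the reported setup (20 topics, α = β = 0.1, 10 runs with varying seeds, log-likelihood tracked per sample). `mean_for_mode_step` refers to the sketch above; the iteration count and helper names are placeholders of ours, not values from the paper.

```python
import numpy as np

K, ALPHA, BETA, ITERS = 20, 0.1, 0.1, 100  # ITERS is our placeholder

def log_likelihood(docs, theta, phi):
    # log p(w | theta, phi), summed over every token in the corpus
    return sum(np.log(theta[d] @ phi[:, w])
               for d, words in enumerate(docs) for w in words)

def run_once(docs, V, seed):
    rng = np.random.default_rng(seed)
    theta = rng.dirichlet(np.full(K, ALPHA), size=len(docs))
    phi = rng.dirichlet(np.full(V, BETA), size=K)
    curve = []
    for _ in range(ITERS):
        theta, phi = mean_for_mode_step(docs, K, V, ALPHA, BETA, theta, phi, rng)
        curve.append(log_likelihood(docs, theta, phi))
    return curve

# "run 10 times with a varying seed for the random number generator":
# curves = [run_once(docs, V, seed) for seed in range(10)]
```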