Distributed Stochastic Gradient MCMC

Authors: Sungjin Ahn, Babak Shahbaba, Max Welling

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments for LDA on Wikipedia and PubMed show that, relative to the state of the art in distributed MCMC, we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.
Researcher Affiliation | Academia | Sungjin Ahn (SUNGJIA@ICS.UCI.EDU), Department of Computer Science, University of California, Irvine; Babak Shahbaba (BABAKS@UCI.EDU), Department of Statistics, University of California, Irvine; Max Welling (M.WELLING@UVA.NL), Machine Learning Group, University of Amsterdam
Pseudocode | Yes | Algorithm 1: D-SGLD Pseudo Code (a minimal sketch of the per-worker update appears after this table)
Open Source Code | No | The paper does not contain an explicit statement about the availability of open-source code for the described methodology, nor does it provide any links to a code repository.
Open Datasets | Yes | We used the same vocabulary of 7702 words as used by Hoffman et al. (2010). (ii) The PubMed Abstract corpus contains 8.2M articles of approximately 730M tokens in total. After removing stopwords and low-occurrence (less than 300) words, we obtained a vocabulary of 39,987 words.
Dataset Splits | No | The predictive perplexities were computed on a separate holdout set of 1000 documents, with a 90/10 (training/test) split, and LDA's hyper-parameters were set to α = 0.01 and β = 0.0001 following Patterson & Teh (2013). A validation split is not explicitly mentioned.
Hardware Specification | No | The paper mentions 'a cluster of 20 workers' and '20 homogeneous workers' but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions 'For our Python implementation' but does not specify software dependencies with version numbers (e.g., the Python version, or specific libraries such as NumPy, SciPy, or machine learning frameworks with their versions).
Experiment Setup | Yes | Following Patterson & Teh (2013), we set the mini-batch size to 50 documents, and for each update of Eqn. (7) we ran 100 Gibbs iterations for each document in the mini-batch. The step sizes were annealed by the schedule ε_t = a(1 + t/b)^(-c). As we fixed b = 1000 and c = 0.6, the entire schedule was set by a, which we chose by running parallel chains with different a's and then choosing the best. (...) LDA's hyper-parameters were set to α = 0.01 and β = 0.0001 following Patterson & Teh (2013). The number of topics K was set to 100. (...) we set the trajectory length τ = 10 for all workers. (The step-size schedule is sketched in code below.)
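The paper's Algorithm 1 (D-SGLD) is not reproduced on this page. As a rough guide, the snippet below is a minimal Python sketch of the stochastic gradient Langevin step that each worker would apply to its local data shard; the names sgld_step, grad_log_prior, and grad_log_lik are illustrative assumptions, not the authors' code. In D-SGLD, each worker runs a trajectory of such steps (τ = 10 in the quoted setup) before the chain is handed to another worker.

```python
import numpy as np

def sgld_step(theta, minibatch, grad_log_prior, grad_log_lik, eps, N):
    """One stochastic gradient Langevin update on a single worker (sketch).

    theta          -- current parameter vector
    minibatch      -- mini-batch drawn from this worker's data shard
    grad_log_prior -- callable returning the gradient of log p(theta)
    grad_log_lik   -- callable returning the gradient of log p(x | theta) for one item x
    eps            -- current step size epsilon_t
    N              -- total number of data items (rescales the mini-batch gradient)
    """
    n = len(minibatch)
    # Unbiased stochastic estimate of the full-data log-posterior gradient.
    grad = grad_log_prior(theta) + (N / n) * sum(
        grad_log_lik(x, theta) for x in minibatch
    )
    # Langevin dynamics: half-step along the gradient plus N(0, eps) injected noise.
    noise = np.random.normal(0.0, np.sqrt(eps), size=theta.shape)
    return theta + 0.5 * eps * grad + noise
```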
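The quoted annealing schedule can be written out directly; b = 1000 and c = 0.6 are fixed to the stated values, while the function name and the example value of a below are assumptions for illustration.

```python
def step_size(t, a, b=1000.0, c=0.6):
    """Annealed step size eps_t = a * (1 + t / b) ** (-c) (sketch).

    b and c are fixed to the quoted values; a is tuned by running parallel
    chains with different values and keeping the best one.
    """
    return a * (1.0 + t / b) ** (-c)

# With a hypothetical a = 0.01, the step size decays from 0.01 at t = 0
# to about 0.01 * 2 ** -0.6 ≈ 0.0066 at t = 1000.
```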