Adaptive Bayesian Sampling with Monte Carlo EM
Authors: Anirban Roychowdhury, Srinivasan Parthasarathy
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In both our synthetic experiments and a high dimensional topic modeling problem with a complex Bayesian nonparametric construction [14], our samplers match or beat the Riemannian variants in sampling efficiency and accuracy, while being close to an order of magnitude faster. We show the RMSE numbers collected from post-burn-in samples as well as per-iteration runtimes in Table 1. |
| Researcher Affiliation | Academia | Anirban Roychowdhury, Srinivasan Parthasarathy Department of Computer Science and Engineering The Ohio State University roychowdhury.7@osu.edu, srini@cse.ohio-state.edu |
| Pseudocode | Yes | Algorithm 1 HMC-EM; Algorithm 2 SGNHT-EM |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use count matrices from the 20-Newsgroups and Reuters Corpus Volume 1 corpora [33]. The Australian credit dataset contains 690 datapoints of dimensionality 14, and the Heart dataset has 270 13-dimensional datapoints. |
| Dataset Splits | No | The paper mentions "We used a chronological 60/40 train-test split for both datasets" but does not specify a separate validation split for the experiments conducted in this paper. |
| Hardware Specification | No | The paper does not specify any hardware details such as CPU, GPU models, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions). |
| Experiment Setup | Yes | Batch sizes were fixed to 100 for all the stochastic algorithms, along with 10 leapfrog iterations across the board. For HMC we used a fairly high learning rate of 1e-2. For SGHMC and SGNHT we used A = 10 and A = 1 respectively. For SGR-NPHMC we used A, B = 0.01. Learning rates were chosen from {1e-2, 1e-4, 1e-6}, and values of the stochastic noise terms were selected from {0.001, 0.01, 0.1, 1, 10}. Leapfrog steps were chosen from {10, 20, 30}. We initialized S_count to 300 for HMC-EM, SGHMC-EM, and SGNHT-EM, and 200 for SGR-NPHMC-EM. |
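The experiment setup above (10 leapfrog iterations, a 1e-2 learning rate for HMC) corresponds to standard Hamiltonian Monte Carlo hyperparameters. The following is a minimal illustrative sketch of plain HMC with a leapfrog integrator and Metropolis correction, not the paper's HMC-EM algorithm; the function names, the toy Gaussian target, and the demo step size are our own assumptions for demonstration.

```python
import numpy as np

def leapfrog(theta, p, grad_log_p, step_size, n_steps):
    # Leapfrog integrator: half-step momentum, alternating full steps,
    # closing half-step momentum.
    p = p + 0.5 * step_size * grad_log_p(theta)
    for _ in range(n_steps - 1):
        theta = theta + step_size * p
        p = p + step_size * grad_log_p(theta)
    theta = theta + step_size * p
    p = p + 0.5 * step_size * grad_log_p(theta)
    return theta, p

def hmc_sample(theta0, log_p, grad_log_p, step_size=1e-2,
               n_leapfrog=10, n_samples=500, rng=None):
    # Defaults mirror the reported setup: 10 leapfrog steps, 1e-2 step size.
    rng = np.random.default_rng(0) if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_samples):
        p0 = rng.standard_normal(theta.shape)
        theta_new, p_new = leapfrog(theta, p0, grad_log_p,
                                    step_size, n_leapfrog)
        # Metropolis accept/reject on the Hamiltonian (negative
        # log-density plus kinetic energy).
        h_old = -log_p(theta) + 0.5 * p0 @ p0
        h_new = -log_p(theta_new) + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:
            theta = theta_new
        samples.append(theta.copy())
    return np.array(samples)

# Toy target: standard 2-D Gaussian. A larger step size than the paper's
# is used here purely so the toy chain mixes quickly.
log_p = lambda th: -0.5 * th @ th
grad_log_p = lambda th: -th
samples = hmc_sample(np.zeros(2), log_p, grad_log_p, step_size=0.1)
```

On this toy target the post-burn-in sample mean and standard deviation should be close to 0 and 1 respectively, which is the kind of sanity check that precedes RMSE comparisons like those in Table 1.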