Optimal Scaling for Locally Balanced Proposals in Discrete Spaces
Authors: Haoran Sun, Hanjun Dai, Dale Schuurmans
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate these theoretical findings in a series of empirical simulations on the Bernoulli model, the Ising model, factorized hidden Markov models (FHMM) and restricted Boltzmann machines (RBM). The experimental outcomes comport with the theory. Moreover, we demonstrate that ALBP can automatically find near optimal scales for these distributions. We also use ALBP to train deep energy based models (EBMs), finding that it reduces the MCMC steps needed in contrastive divergence training (Hinton, 2002; Tieleman & Hinton, 2009), significantly improving the efficiency of the overall training procedure. |
| Researcher Affiliation | Collaboration | Haoran Sun (Georgia Tech, hsun349@gatech.edu); Hanjun Dai (Google Brain, hadai@google.com); Dale Schuurmans (Google Brain, U of Alberta, schuurmans@google.com) |
| Pseudocode | Yes | Algorithm 1: A M-H step of LBP-R and ALBP (see the illustrative sketch after this table) |
| Open Source Code | Yes | Did you include the license to the code and datasets? [Yes] See https://github.com/ha0ransun/LBP_Scale.git. |
| Open Datasets | Yes | We validate these theoretical findings in a series of empirical simulations on the Bernoulli model, the Ising model, factorized hidden Markov models (FHMM) and restricted Boltzmann machines (RBM). ... We train an RBM on the MNIST dataset using contrastive divergence (Hinton, 2002) and sample observable variables x. |
| Dataset Splits | No | The paper describes varying model dimensionalities and sizes for simulations (e.g., N=100, 800, 6400 for Bernoulli; p=20, 50, 100 for Ising), and uses an adaptive sampler to obtain an estimated scale R for performance curves. However, it does not specify explicit training, validation, or test dataset splits with percentages or sample counts for any of the models used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU models, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions "Tensorflow Probability" for ESS computation (see the ESS usage sketch after this table) but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | For each model, we consider three configurations: C1, C2, and C3 for smooth, moderate, and sharp target distributions. ... For each configuration, we simulate on domains with three dimensionalities: N = 100, 800, 6400. ... We follow common practice and adapt the tunable MCMC parameters during a warmup phase before freezing them thereafter (Gelman et al., 2013). The computational cost of (22) is negligible compared to the total cost of an M-H step. The algorithm boxes for ALBP and ARWM are given in Appendix C. |
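To make the quoted Algorithm 1 reference concrete, below is a minimal sketch of one Metropolis-Hastings step with a single-flip locally balanced proposal on a binary distribution, assuming the common balancing function g(t) = sqrt(t). The function name `lbp_mh_step`, the single-flip simplification, and the brute-force neighbourhood evaluation are illustrative assumptions; the paper's LBP-R/ALBP flips multiple coordinates per step with a tunable scale R and adapts that scale, which this sketch does not do.

```python
import numpy as np

def lbp_mh_step(x, log_prob, rng):
    """One M-H step with a single-flip locally balanced proposal.

    Illustrative sketch only: evaluates all single-flip neighbours for
    clarity rather than efficiency, and uses the balancing function
    g(t) = sqrt(t).
    """
    N = x.size
    logp_x = log_prob(x)

    # Log-probabilities of all single-flip neighbours of x.
    neighbour_logps = np.array([
        log_prob(np.where(np.arange(N) == i, 1 - x, x)) for i in range(N)
    ])

    # Locally balanced weights g(pi(x')/pi(x)) with g(t) = sqrt(t).
    log_w = 0.5 * (neighbour_logps - logp_x)
    w = np.exp(log_w - log_w.max())
    probs = w / w.sum()

    # Propose flipping one coordinate drawn from the balanced weights.
    i = rng.choice(N, p=probs)
    y = x.copy()
    y[i] = 1 - y[i]
    logp_y = neighbour_logps[i]

    # Reverse-move proposal probability Q(x | y) for the M-H correction.
    rev_logps = np.array([
        log_prob(np.where(np.arange(N) == j, 1 - y, y)) for j in range(N)
    ])
    rev_log_w = 0.5 * (rev_logps - logp_y)
    rev_w = np.exp(rev_log_w - rev_log_w.max())
    rev_probs = rev_w / rev_w.sum()

    # Standard M-H acceptance: pi(y) Q(x|y) / (pi(x) Q(y|x)).
    log_alpha = (logp_y - logp_x) + np.log(rev_probs[i]) - np.log(probs[i])
    return y if np.log(rng.uniform()) < log_alpha else x
```

The sqrt balancing function is one standard choice in the locally balanced family; the sketch can be driven with any binary log-density, e.g. a factorized Bernoulli model `log_prob = lambda x: float(theta @ x)` for some parameter vector `theta`.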
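Regarding the ESS computation quoted under Software Dependencies, the paper only names TensorFlow Probability; a minimal usage sketch of the corresponding public API (the chain shape, values, and variable names below are placeholders, not the authors' setup) could look like:

```python
import tensorflow as tf
import tensorflow_probability as tfp

# Hypothetical traced chain statistics, shape [num_steps, num_chains];
# in practice these would be summary statistics recorded by the sampler.
chain = tf.random.normal([5000, 4])

# Effective sample size per chain, as provided by TensorFlow Probability.
ess = tfp.mcmc.effective_sample_size(chain)
print(ess)
```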