Optimal Scaling for Locally Balanced Proposals in Discrete Spaces

Authors: Haoran Sun, Hanjun Dai, Dale Schuurmans

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate these theoretical findings in a series of empirical simulations on the Bernoulli model, the Ising model, factorized hidden Markov models (FHMM), and restricted Boltzmann machines (RBM). The experimental outcomes comport with the theory. Moreover, we demonstrate that ALBP can automatically find near-optimal scales for these distributions. We also use ALBP to train deep energy-based models (EBMs), finding that it reduces the MCMC steps needed in contrastive divergence training (Hinton, 2002; Tieleman & Hinton, 2009), significantly improving the efficiency of the overall training procedure.
Researcher Affiliation | Collaboration | Haoran Sun (Georgia Tech, hsun349@gatech.edu); Hanjun Dai (Google Brain, hadai@google.com); Dale Schuurmans (Google Brain & University of Alberta, schuurmans@google.com)
Pseudocode | Yes | Algorithm 1: A M-H step of LBP-R and ALBP. (A minimal sketch of such a step follows the table.)
Open Source Code | Yes | Did you include the license to the code and datasets? [Yes] See https://github.com/ha0ransun/LBP_Scale.git.
Open Datasets | Yes | We validate these theoretical findings in a series of empirical simulations on the Bernoulli model, the Ising model, factorized hidden Markov models (FHMM), and restricted Boltzmann machines (RBM). ... We train an RBM on the MNIST dataset using contrastive divergence (Hinton, 2002) and sample observable variables x. (A contrastive-divergence sketch follows the table.)
Dataset Splits | No | The paper describes varying model dimensionalities and sizes for simulations (e.g., N = 100, 800, 6400 for Bernoulli; p = 20, 50, 100 for Ising), and uses an adaptive sampler to obtain an estimated scale R for performance curves. However, it does not specify explicit training, validation, or test dataset splits with percentages or sample counts for any of the models used.
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU models, or cloud computing instance types.
Software Dependencies | No | The paper mentions "Tensorflow Probability" for ESS computation but does not specify its version number or any other software dependencies with their respective versions. (An illustrative ESS snippet follows the table.)
Experiment Setup | Yes | For each model, we consider three configurations: C1, C2, and C3 for smooth, moderate, and sharp target distributions. ... For each configuration, we simulate on domains with three dimensionalities: N = 100, 800, 6400. ... We follow common practice and adapt the tunable MCMC parameters during a warmup phase before freezing them thereafter (Gelman et al., 2013). The computational cost of (22) is negligible compared to the total cost of an M-H step. The algorithm boxes for ALBP and ARWM are given in Appendix C. (A hypothetical warmup-adaptation sketch follows the table.)
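
For the Pseudocode row above: a minimal sketch of a single locally balanced M-H step on binary vectors, assuming a factorized Bernoulli target, the weight function g(t) = sqrt(t), and a one-bit-flip neighborhood. The paper's LBP-R and ALBP generalize this to larger proposal scales R with adaptive tuning; all names here are illustrative, not the authors' code.

```python
import numpy as np

def lbp_step(x, theta, rng):
    """One locally balanced M-H step with g(t) = sqrt(t) and single-bit flips.

    Target (unnormalized): log pi(x) = sum_i theta_i * x_i. With g = sqrt,
    the acceptance ratio reduces to Z(x) / Z(x'), the ratio of the proposal
    normalizers at the current and proposed states.
    """
    delta = (1 - 2 * x) * theta          # log pi(flip_i(x)) - log pi(x), per bit
    w = np.exp(0.5 * delta)              # g(pi(x') / pi(x)) for each candidate flip
    z_x = w.sum()
    i = rng.choice(len(x), p=w / z_x)    # sample which bit to flip
    x_new = x.copy()
    x_new[i] = 1.0 - x_new[i]
    z_new = np.exp(0.5 * (1 - 2 * x_new) * theta).sum()
    return x_new if rng.random() < z_x / z_new else x  # M-H correction

rng = np.random.default_rng(0)
theta = rng.normal(size=100)             # sharper theta ~ sharper target (cf. C1-C3)
x = rng.integers(0, 2, size=100).astype(float)
for _ in range(1000):
    x = lbp_step(x, theta, rng)
```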
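
For the Open Datasets row: the RBM experiment trains with contrastive divergence (Hinton, 2002). A minimal CD-1 update for a binary RBM is sketched below; shapes, variable names, and the single Gibbs sweep are assumptions for illustration, not the authors' training code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step for a binary RBM (names assumed): v0 is (batch, n_vis)
    binary data, W is (n_vis, n_hid), b/c are visible/hidden biases."""
    ph0 = sigmoid(v0 @ W + c)                         # positive phase
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                       # one Gibbs sweep back
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)                         # negative phase
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n           # CD gradient estimates
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
```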
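
For the Software Dependencies row: the ESS computation presumably goes through tfp.mcmc.effective_sample_size. A sketch, where the traced statistic is a stand-in since the report does not say which summary the authors feed to the estimator:

```python
import tensorflow as tf
import tensorflow_probability as tfp

# chain: [num_steps, num_chains] trace of a scalar summary statistic of the
# sampler; random values here stand in for an actual traced quantity.
chain = tf.random.normal([5000, 4])
ess = tfp.mcmc.effective_sample_size(chain, filter_beyond_positive_pairs=True)
```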
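
For the Experiment Setup row: warmup-phase adaptation of a proposal scale can be written as a stochastic-approximation update. The sketch below is hypothetical; the target acceptance rate and step-size schedule are assumptions, not the paper's update rule (22).

```python
def adapt_log_scale(log_r, accept_prob, t, target=0.574):
    """Hypothetical Robbins-Monro warmup update of the log proposal scale.

    Nudges the scale up when the observed acceptance probability exceeds the
    target and down otherwise; t >= 1 is the warmup step index. The 0.574
    default mirrors the classical MALA-style optimum and is an assumption,
    as is the t**-0.6 step-size decay.
    """
    return log_r + t ** -0.6 * (accept_prob - target)

# During warmup: log_r = adapt_log_scale(log_r, acc_rate, step); freeze after.
```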