Estimating Convergence of Markov chains with L-Lag Couplings

Authors: Niloy Biswas, Pierre E. Jacob, Paul Vanetti

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate numerically that the bounds provide a practical assessment of convergence for various popular MCMC algorithms, on either discrete or continuous and possibly high-dimensional spaces. In Section 3 we consider applications including Gibbs samplers on the Ising model and gradient-based MCMC algorithms on log-concave targets.
Researcher Affiliation Academia Niloy Biswas Harvard University niloy_biswas@g.harvard.edu; Pierre E. Jacob Harvard University pjacob@fas.harvard.edu; Paul Vanetti University of Oxford paul.vanetti@spc.ox.ac.uk
Pseudocode Yes Algorithm 1: Sampling L-lag meeting times; Algorithm 2: A maximal coupling of p and q
Open Source Code Yes All scripts in R are available at https://github.com/niloyb/Llag Couplings.
Open Datasets Yes Consider the German Credit data from [34]. There are n = 1000 binary responses (Yi)n i=1 { 1, 1}n indicating whether individuals are creditworthy or not creditworthy, and d = 49 covariates xi Rd for each individual i. [34] Moshe Lichman. UCI machine learning repository, 2013. URL http://archive.ics.uci.edu/ml.
Dataset Splits No The paper describes experiments comparing different MCMC algorithms and validating the proposed bounds against exact distances or theoretical work. However, it does not specify explicit training/validation/test dataset splits as these are not typically defined for MCMC convergence studies.
Hardware Specification No The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies Yes All scripts in R are available at https://github.com/niloyb/Llag Couplings. The figures were created with packages [58, 57] in R Core Team [45]. [58] Claus O. Wilke. ggridges: Ridgeline plots in ggplot2 . R package version 0.4, 1, 2017.
Experiment Setup Yes We set the initial distribution π0 to be a point mass at 10. We use L = 1 and L = 150. For each L, N = 10000 independent runs of Algorithm 1 were performed to estimate the bounds in Theorem 2.5 by empirical averages. (from Section 2.2.1) For β = 0.46, we obtain TV bounds for SSG using a lag L = 106, and N = 500 independent repeats. For PT we use 12 chains, each targeting πβ with β in an equispaced grid ranging from 0.3 to 0.46, a frequency of swap moves of 0.02, and a lag L = 2 104. (from Section 3.1)