Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
Authors: Rob Cornish, Paul Vanetti, Alexandre Bouchard-Côté, George Deligiannidis, Arnaud Doucet
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experimental Results: In this section we apply SMH to Bayesian logistic regression. A full description of the model and the upper bounds (12) we used is given in Section G.2 of the Supplement. We also provide there an additional application of our method to robust linear regression. We chose these models due to the availability of the lower bounds on the likelihoods required by Firefly. In our experiments we took d = 10. For both SMH-1 and SMH-2 we used truncation as described in Section 2.3, with R = n. Our estimate of the mode θ̂ was computed using stochastic gradient descent. We compare our algorithms to standard MH, Firefly, and Zig-Zag (Bierkens et al., 2019), all of which have the exact posterior as their invariant distribution. We used the MAP-tuned variant of Firefly (which also makes use of θ̂) with implicit sampling (this uses an algorithmic parameter q_{d→b} = 10^{-3}; the optimal choice of q_{d→b} is an open question). Figure 1 (in Section 1) shows the average number of likelihood evaluations per step and confirms the predictions of Theorem 3.1. Figure 2 displays the effective sample sizes (ESS) for the posterior mean estimate of one regression coefficient, rescaled by execution time. For large n, SMH-2 significantly outperforms competing techniques. (A hedged sketch of this model setup appears after the table.) |
| Researcher Affiliation | Academia | 1University of Oxford, Oxford, United Kingdom 2University of British Columbia, Vancouver, Canada 3The Alan Turing Institute, London, United Kingdom. Correspondence to: Rob Cornish <rcornish@robots.ox.ac.uk>. |
| Pseudocode | Yes | Algorithm 1: Efficient implementation of the FMH kernel. (A minimal FMH step sketch appears after the table.) |
| Open Source Code | Yes | Code to reproduce our experiments is available at github.com/pjcv/smh. |
| Open Datasets | No | The paper mentions 'Bayesian logistic regression' and 'robust linear regression' but does not specify a publicly available dataset by name, link, DOI, or formal citation in the main text. |
| Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., CPU, GPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In our experiments we took d = 10. For both SMH-1 and SMH-2 we used truncation as described in Section 2.3, with R = n. Our estimate of the mode θ̂ was computed using stochastic gradient descent. We used the MAP-tuned variant of Firefly (which also makes use of θ̂) with implicit sampling (this uses an algorithmic parameter q_{d→b} = 10^{-3}; the optimal choice of q_{d→b} is an open question). For all methods except Zig-Zag we used the proposal (16) with σ = 1, which automatically scales according to the concentration of the target. (A hedged sketch of one possible such proposal appears after the table.) |
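
The Research Type and Experiment Setup rows quote a Bayesian logistic regression setup with d = 10 and a mode estimate θ̂ obtained by stochastic gradient descent. Below is a minimal sketch of those two ingredients, assuming labels y_i in {-1, +1} and a flat prior; the paper's actual prior, upper bounds (12), and SGD hyperparameters are given in its Supplement, and the names `make_log_factors` and `sgd_mode` are illustrative.

```python
import numpy as np

def make_log_factors(X, y):
    """Per-datum log-likelihood factors for logistic regression:
    log f_i(theta) = -log(1 + exp(-y_i * x_i . theta)).
    This per-datum factorisation is what FMH/SMH exploit."""
    factors = []
    for x_i, y_i in zip(X, y):
        def log_f(theta, x_i=x_i, y_i=y_i):  # default args freeze the datum
            return -np.logaddexp(0.0, -y_i * (x_i @ theta))
        factors.append(log_f)
    return factors

def sgd_mode(X, y, lr=1e-2, epochs=20, seed=0):
    """Rough stochastic gradient ascent toward the mode theta_hat.
    Learning rate and epoch count are illustrative, not from the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # d/dtheta log f_i(theta) = sigmoid(-y_i * x_i.theta) * y_i * x_i
            s = 1.0 / (1.0 + np.exp(y[i] * (X[i] @ theta)))
            theta += lr * s * y[i] * X[i]
    return theta
```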
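The Pseudocode row refers to the paper's Algorithm 1, an efficient implementation of the Factorised Metropolis-Hastings (FMH) kernel. The sketch below shows only the basic FMH step with early rejection, assuming a symmetric proposal and a target that factorises over data points; the paper's Algorithm 1 goes further, using bounds on the factors to avoid evaluating most of them at all.

```python
def fmh_step(theta, log_factors, propose, rng):
    """One FMH step with early rejection (a sketch, not Algorithm 1).

    For a symmetric proposal, the FMH acceptance probability factorises as
    prod_i min(1, f_i(theta') / f_i(theta)), so the move goes through only
    if every per-factor Bernoulli check passes; the loop can stop at the
    first failure."""
    theta_prime = propose(theta, rng)
    for log_f in log_factors:
        log_ratio = log_f(theta_prime) - log_f(theta)
        # Per-factor check: pass with probability min(1, f_i(theta')/f_i(theta)).
        if np.log(rng.uniform()) >= min(0.0, log_ratio):
            return theta  # one factor failed: reject immediately
    return theta_prime  # all factors accepted
```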
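Finally, the setup row mentions "the proposal (16) with σ = 1, which automatically scales according to the concentration of the target." The exact form of proposal (16) is defined in the paper; one plausible reading, assumed here purely for illustration, is a Gaussian random walk whose covariance shrinks as O(1/n) to match the posterior's Bernstein-von Mises concentration.

```python
def make_scaled_proposal(cov_hat, n, sigma=1.0):
    """Gaussian random-walk proposal with covariance sigma^2 * cov_hat / n.

    ASSUMPTION: the O(1/n) scaling is one reading of 'scales according to
    the concentration of the target'; see the paper for the actual
    definition of proposal (16). cov_hat would be an estimate of the
    posterior shape, e.g. the inverse observed information at theta_hat."""
    chol = np.linalg.cholesky((sigma**2 / n) * cov_hat)
    def propose(theta, rng):
        return theta + chol @ rng.standard_normal(theta.shape[0])
    return propose
```

Chained together, repeated calls to `fmh_step(theta, make_log_factors(X, y), make_scaled_proposal(cov_hat, len(y)), rng)` would advance such a sampler one step at a time; these remain sketches, not the released implementation at github.com/pjcv/smh.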