Scalable Nonparametric Sampling from Multimodal Posteriors with the Posterior Bootstrap
Authors: Edwin Fong, Simon Lyddon, Chris Holmes
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate this on Gaussian mixture model and sparse logistic regression examples. We compare NPL to conventional Bayesian inference with the No-U-Turn Sampler (NUTS) and Automatic Differentiation Variational Inference (ADVI). We evaluate the predictive performance of each method on held-out test data. |
| Researcher Affiliation | Academia | 1Department of Statistics, University of Oxford, Oxford, United Kingdom 2The Alan Turing Institute, London, United Kingdom. |
| Pseudocode | Yes | Algorithm 1 NPL Posterior Sampling; Algorithm 2 Posterior Bootstrap Sampling; Algorithm 3 RR-NPL Posterior Sampling; Algorithm 4 FI-NPL Posterior Sampling. (A minimal posterior-bootstrap sketch appears after this table.) |
| Open Source Code | Yes | We now demonstrate our method on some examples; the code is available online 1. 1https://github.com/edfong/npl |
| Open Datasets | Yes | We analyze 3 binary classification datasets from the UCI ML repository (Dheeru & Karra Taniskidou, 2017): Adult (Kohavi, 1996), Polish companies bankruptcy 3rd year (Zięba et al., 2016), and Arcene (Guyon et al., 2005), with details in Table 3. MNIST (LeCun & Cortes, 2010). |
| Dataset Splits | Yes | We generate n_train = 1000 points for model fitting and another n_test = 250 held-out points for model evaluation, with different seeds for each of the 30 runs. We carry out a random stratified train-test split for each of the 30 runs, with an 80-20 split for Adult and Polish and a 50-50 split for Arcene due to its smaller size. (A split sketch appears after this table.) |
| Hardware Specification | Yes | All NPL examples are run on 4 Azure F72s v2 (72 vCPUs) virtual machines, implemented in Python. The NUTS and ADVI examples cannot be implemented in an embarrassingly parallel manner, so they are run on a single Azure F72s v2. (A parallelization sketch appears after this table.) |
| Software Dependencies | No | The paper mentions software like 'Python', 'sklearn.mixture', 'scipy.optimize', and 'Stan' but does not specify exact version numbers for these dependencies. |
| Experiment Setup | Yes | For the Bayesian model we set a0 = 1, and for NPL we set α = 0 as n ≫ p. We optimize each bootstrap maximization with a weighted EM algorithm... For RR-NPL, we initialize π ∼ Dir(1, …, 1), µ_kj ∼ Unif(−2, 6), and σ²_kj ∼ IG(1, 1) for each restart. For FI-NPL we initialize with one of the posterior modes from RR-NPL. We produce 2000 posterior samples for each method. (An initialization sketch appears after this table.) |
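
The Pseudocode row lists Algorithm 2, Posterior Bootstrap Sampling. Below is a minimal sketch of the α = 0 special case (the weighted likelihood bootstrap, with no prior pseudo-samples): draw Dirichlet weights over the n observations and maximize the weighted log-likelihood for each posterior draw. The names `neg_log_lik` and `theta_init` are hypothetical placeholders; the released code uses model-specific optimizers rather than a generic `scipy.optimize` call.

```python
import numpy as np
from scipy.optimize import minimize

def posterior_bootstrap(x, neg_log_lik, theta_init, B=2000, seed=0):
    """Sketch of posterior bootstrap sampling with alpha = 0:
    each draw re-weights the data with Dirichlet(1, ..., 1) weights
    and maximizes the weighted log-likelihood."""
    rng = np.random.default_rng(seed)
    n = len(x)
    samples = []
    for _ in range(B):
        w = rng.dirichlet(np.ones(n))  # random probability weights over the data
        obj = lambda th: np.sum(w * np.array([neg_log_lik(th, xi) for xi in x]))
        res = minimize(obj, theta_init)  # weighted maximum likelihood step
        samples.append(res.x)
    return np.array(samples)
```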
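
The Dataset Splits row describes a stratified train-test split repeated over 30 runs. A minimal sketch with scikit-learn follows; the synthetic `X`, `y` stand in for a loaded UCI dataset, and using the run index as the random seed is an assumption (the paper only states that seeds differ across runs).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for a loaded dataset (the paper uses Adult, Polish, Arcene).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = rng.integers(0, 2, size=1000)

for run in range(30):
    # Stratified 80-20 split per run (50-50 for Arcene), different seed each run.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=run)
    # ... fit NPL / NUTS / ADVI on the training split, score on the held-out split
```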
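
The Hardware row notes that NPL is embarrassingly parallel: each of the 2000 bootstrap maximizations is independent, so they can be farmed out across cores or machines. A sketch with joblib follows; joblib and `fit_one_bootstrap` are assumptions for illustration, not dependencies or functions named by the paper.

```python
import numpy as np
from joblib import Parallel, delayed

def fit_one_bootstrap(seed):
    # Hypothetical worker: draw one set of Dirichlet weights and run one
    # weighted maximization (see the posterior-bootstrap sketch above).
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(100))
    return w  # placeholder for the fitted parameter vector

# Run the independent bootstrap fits across all available cores.
samples = Parallel(n_jobs=-1)(delayed(fit_one_bootstrap)(b) for b in range(2000))
```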
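
The Experiment Setup row quotes the random-restart initialization for the weighted-EM GMM fits. A sketch of one restart is below; the Unif(−2, 6) bounds follow the reconstructed quote and may differ from the released code.

```python
import numpy as np

def rr_npl_restart_init(K, d, rng):
    """One random-restart initialization for a K-component, d-dimensional GMM,
    following the draws quoted above."""
    pi = rng.dirichlet(np.ones(K))                   # mixing weights ~ Dir(1, ..., 1)
    mu = rng.uniform(-2.0, 6.0, size=(K, d))         # component means ~ Unif(-2, 6)
    sigma2 = 1.0 / rng.gamma(1.0, 1.0, size=(K, d))  # variances ~ Inverse-Gamma(1, 1)
    return pi, mu, sigma2

pi, mu, sigma2 = rr_npl_restart_init(K=3, d=2, rng=np.random.default_rng(0))
```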