Markovian Score Climbing: Variational Inference with KL(p||q)
Authors: Christian Naesseth, Fredrik Lindsten, David Blei
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In empirical studies, we demonstrate the convergence properties and advantages of MSC. First, we illustrate the systematic errors of the biased methods and how MSC differs on a toy skew-normal model. Then we compare MSC with expectation propagation (EP) and importance sampling (IS)-based optimization [7, 19] on a Bayesian probit classification example with benchmark data. Finally, we apply MSC and SMC-based optimization [22] to fit a stochastic volatility model on exchange rate data. |
| Researcher Affiliation | Academia | Christian A. Naesseth, Columbia University, USA (christian.a.naesseth@columbia.edu); Fredrik Lindsten, Linköping University, Sweden (fredrik.lindsten@liu.se); David Blei, Columbia University, USA (david.blei@columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Markovian Score Climbing, Algorithm 2: Conditional Importance Sampling, Algorithm 3: Markovian Score Climbing with ML (a code sketch of Algorithms 1 and 2 follows the table) |
| Open Source Code | Yes | Code is available at github.com/blei-lab/markovian-score-climbing. |
| Open Datasets | Yes | We apply the model for prediction in several UCI datasets [18]. ... [18] D. Dua and C. Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci. |
| Dataset Splits | Yes | The results were generated by splitting each dataset 100 times into 90% training and 10% test data, then computing average prediction error and its standard deviation. (A sketch of this evaluation protocol follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For SGD methods we use adaptive step-sizes [29]. ... We learn the model and variational parameters using S = 10 particles for both methods, and estimate the log-marginal likelihood after convergence using S = 10,000. (An illustrative estimator follows the table.) |
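The pseudocode row names Algorithm 1 (MSC) and Algorithm 2 (CIS). Below is a minimal sketch of how the two fit together on a toy skew-normal target like the one in the Research Type row, assuming a 1-D Gaussian variational family. The skewness value, fixed step size (the paper uses adaptive step-sizes [29]), and iteration count are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch of Markovian Score Climbing (Algorithm 1) driven by a
# Conditional Importance Sampling kernel (Algorithm 2). Assumptions: a toy
# skew-normal target, a 1-D Gaussian variational family, and a fixed step size.
import numpy as np
from scipy.stats import norm

def log_p(z):
    """Unnormalized log density of a toy skew-normal target (alpha is illustrative)."""
    alpha = 5.0
    return norm.logpdf(z) + norm.logcdf(alpha * z)

def log_q(z, mu, log_sigma):
    """Log density of the Gaussian variational family q(z; lambda)."""
    return norm.logpdf(z, loc=mu, scale=np.exp(log_sigma))

def score_q(z, mu, log_sigma):
    """Score function: gradient of log q(z; lambda) w.r.t. lambda = (mu, log sigma)."""
    sigma = np.exp(log_sigma)
    return np.array([(z - mu) / sigma**2, ((z - mu) / sigma) ** 2 - 1.0])

def cis_kernel(z_cond, mu, log_sigma, S, rng):
    """One CIS step: S-1 fresh proposals from q plus the retained conditioning
    sample; resampling by importance weight leaves the target p invariant."""
    z = rng.normal(mu, np.exp(log_sigma), size=S)
    z[-1] = z_cond
    log_w = log_p(z) - log_q(z, mu, log_sigma)
    w = np.exp(log_w - log_w.max())
    return z[rng.choice(S, p=w / w.sum())]

def msc(num_iters=20_000, S=10, step=0.01, seed=0):
    rng = np.random.default_rng(seed)
    mu, log_sigma, z = 0.0, 0.0, 0.0
    for _ in range(num_iters):
        z = cis_kernel(z, mu, log_sigma, S, rng)  # Markov step targeting p
        # Stochastic ascent on E_p[log q], i.e. descent on KL(p || q).
        mu, log_sigma = np.array([mu, log_sigma]) + step * score_q(z, mu, log_sigma)
    return mu, np.exp(log_sigma)

print(msc())
```

Because the CIS kernel leaves the exact posterior invariant, the score evaluated at the chain's samples yields consistent gradients of KL(p||q), avoiding the systematic bias of plain IS-based gradient estimates that the paper's toy example illustrates.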
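The Experiment Setup row reports estimating the log-marginal likelihood after convergence with S = 10,000. Continuing the sketch above (it reuses `log_p` and `log_q`), the following is a hedged stand-in that does this by simple importance sampling from the fitted q; the paper's actual estimator (e.g. SMC-based for the stochastic volatility model) may differ.

```python
# Importance-sampling estimate of log p(x), reusing log_p and log_q from the
# sketch above. An assumed stand-in for the paper's estimator.
def log_marginal_likelihood(mu, log_sigma, S=10_000, seed=1):
    rng = np.random.default_rng(seed)
    z = rng.normal(mu, np.exp(log_sigma), size=S)
    log_w = log_p(z) - log_q(z, mu, log_sigma)
    return np.logaddexp.reduce(log_w) - np.log(S)  # log of the mean weight
```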
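The Dataset Splits row describes 100 random 90%/10% splits with averaged prediction error. A minimal sketch of that protocol follows; `fit` and `predict` are hypothetical stand-ins for the Bayesian probit classifier, and scikit-learn's splitter is an assumed convenience, not a dependency stated in the paper.

```python
# Hypothetical evaluation harness: 100 random 90%/10% train/test splits,
# reporting mean prediction error and its standard deviation.
import numpy as np
from sklearn.model_selection import train_test_split

def evaluate(X, y, fit, predict, n_splits=100, seed=0):
    errors = []
    for i in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.1, random_state=seed + i)
        params = fit(X_tr, y_tr)  # e.g. run MSC to fit the variational posterior
        errors.append(np.mean(predict(params, X_te) != y_te))
    return float(np.mean(errors)), float(np.std(errors))
```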