Markovian Score Climbing: Variational Inference with KL(p||q)

Authors: Christian Naesseth, Fredrik Lindsten, David Blei

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In empirical studies, we demonstrate the convergence properties and advantages of MSC. First, we illustrate the systematic errors of the biased methods and how MSC differs on a toy skew-normal model. Then we compare MSC with expectation propagation (EP) and importance sampling (IS)-based optimization [7, 19] on a Bayesian probit classification example with benchmark data. Finally, we apply MSC and SMC-based optimization [22] to fit a stochastic volatility model on exchange rate data." |
| Researcher Affiliation | Academia | Christian A. Naesseth, Columbia University, USA (christian.a.naesseth@columbia.edu); Fredrik Lindsten, Linköping University, Sweden (fredrik.lindsten@liu.se); David Blei, Columbia University, USA (david.blei@columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Markovian Score Climbing; Algorithm 2: Conditional Importance Sampling; Algorithm 3: Markovian Score Climbing with ML (a sketch of Algorithms 1 and 2 appears after this table) |
| Open Source Code | Yes | "Code is available at github.com/blei-lab/markovian-score-climbing." |
| Open Datasets | Yes | "We apply the model for prediction in several UCI datasets [18]." [18] D. Dua and C. Graff. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci. |
| Dataset Splits | Yes | "The results were generated by splitting each dataset 100 times into 90% training and 10% test data, then computing average prediction error and its standard deviation." (see the splitting sketch after this table) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | "For SGD methods we use adaptive step-sizes [29]. ... We learn the model and variational parameters using S = 10 particles for both methods, and estimate the log-marginal likelihood after convergence using S = 10,000." (see the estimator sketch after this table) |
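
To make the Pseudocode row concrete, here is a minimal sketch of Markovian Score Climbing driven by a conditional importance sampling kernel (Algorithms 1 and 2), applied to a toy skew-normal target like the one mentioned in the Research Type row. The Gaussian variational family, the skew-normal shape parameter, and the Robbins-Monro step size 1/k are illustrative assumptions, not the authors' implementation; see the linked repository for that.

```python
import numpy as np
from scipy.stats import norm, skewnorm

rng = np.random.default_rng(0)

def log_p(z):
    # Unnormalized log target: a skew-normal toy model (illustrative choice).
    return skewnorm.logpdf(z, a=5.0)

def log_q(z, lam):
    # Gaussian variational family; lam = (mean, log_std).
    mu, log_sigma = lam
    return norm.logpdf(z, loc=mu, scale=np.exp(log_sigma))

def score_q(z, lam):
    # Score function grad_lam log q(z; lam) for the Gaussian family.
    mu, log_sigma = lam
    sigma = np.exp(log_sigma)
    return np.array([(z - mu) / sigma**2, ((z - mu) / sigma) ** 2 - 1.0])

def cis_kernel(z_cond, lam, S=10):
    # Conditional importance sampling: keep the conditioning particle,
    # draw S-1 fresh proposals from q, resample one by importance weight.
    mu, log_sigma = lam
    z = np.empty(S)
    z[0] = z_cond
    z[1:] = rng.normal(mu, np.exp(log_sigma), size=S - 1)
    log_w = log_p(z) - log_q(z, lam)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return z[rng.choice(S, p=w)]

# Markovian Score Climbing: the CIS kernel leaves p invariant, and each
# state of the chain drives a score-function ascent step on
# E_p[log q(z; lam)], i.e. stochastic minimization of KL(p || q).
lam = np.array([0.0, 0.0])
z = 0.0
for k in range(1, 5001):
    z = cis_kernel(z, lam, S=10)
    lam = lam + (1.0 / k) * score_q(z, lam)  # Robbins-Monro step size

print("fitted mean, std:", lam[0], np.exp(lam[1]))
```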
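The Dataset Splits row describes a standard repeated hold-out protocol. A minimal sketch, assuming scikit-learn for the splitting and a hypothetical `fit_predict` placeholder for the classifier:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def repeated_holdout_error(X, y, fit_predict, n_splits=100,
                           test_size=0.10, seed=0):
    # Average test prediction error and its standard deviation over
    # repeated random 90%/10% train/test splits, as in the paper.
    # `fit_predict` is a hypothetical placeholder mapping
    # (X_train, y_train, X_test) -> predicted labels.
    errors = []
    for i in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed + i)
        errors.append(np.mean(fit_predict(X_tr, y_tr, X_te) != y_te))
    return np.mean(errors), np.std(errors)
```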
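For the Experiment Setup row, the post-convergence log-marginal-likelihood estimate with S = 10,000 samples can be illustrated with a plain importance sampling estimator that uses the fitted q as proposal. The paper's CIS/SMC-based estimators differ, so treat this as an assumption-laden sketch; `log_joint`, `sample_q`, and `log_q` are hypothetical placeholders.

```python
import numpy as np
from scipy.special import logsumexp

def log_marginal_is(log_joint, sample_q, log_q, S=10_000):
    # Importance sampling estimate of the log-marginal likelihood:
    #   log p(y) ~= log (1/S) * sum_s exp(log p(y, z_s) - log q(z_s)),
    # with z_s drawn from the fitted variational distribution q.
    # `log_joint`, `sample_q`, `log_q` stand in for the model's
    # log p(y, z), a sampler for q, and log q(z), respectively.
    z = sample_q(S)                      # S draws from q
    log_w = log_joint(z) - log_q(z)      # log importance weights
    return logsumexp(log_w) - np.log(S)  # log of the average weight
```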