Semi-Implicit Variational Inference

Authors: Mingzhang Yin, Mingyuan Zhou

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We implement SIVI in Tensorflow (Abadi et al., 2015) for a range of inference tasks. ... The toy examples show SIVI captures skewness, kurtosis, and multimodality. A negative binomial model shows SIVI can accurately capture the dependencies between latent variables. ... With Bayesian logistic regression, we demonstrate that SIVI can either work alone as a black-box inference procedure for correlated latent variables, or directly expand MFVI by adding a mixing distribution, leading to accurate uncertainty estimation on par with that of MCMC.
Researcher Affiliation | Academia | ¹Department of Statistics and Data Sciences, ²Department of IROM, McCombs School of Business, The University of Texas at Austin, Austin TX 78712, USA.
Pseudocode | Yes | we describe the stochastic gradient ascent algorithm to optimize the variational parameter in Algorithm 1
Open Source Code | Yes | Code is provided at https://github.com/mingzhang-yin/SIVI
Open Datasets | Yes | We consider the MNIST dataset that is stochastically binarized as in Salakhutdinov & Murray (2008). ... We apply Gibbs sampling, MFVI, and SIVI to a real overdispersed count dataset of Bliss & Fisher (1953)...
Dataset Splits | No | The paper states 'We use 55,000 for training and use the 10,000 observations in the testing set for performance evaluation' for the MNIST dataset, but does not explicitly provide details for a validation split.
Hardware Specification | Yes | On waveform, the algorithm converges in about 500 iterations, which takes about 40 seconds on a 2.4 GHz CPU.
Software Dependencies | No | The paper states 'We implement SIVI in Tensorflow (Abadi et al., 2015)' but does not provide a specific version number for TensorFlow or any other software dependencies.
Experiment Setup | Yes | We fix σ₀² = 0.1 and optimize the implicit layer to minimize KL...; With K = 1000...; We set K = 200 for SIVI.; is set as 0.01.; M = 3 stochastic layers
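
The Open Datasets entry mentions that MNIST is "stochastically binarized". As a rough illustration of what that preprocessing step usually means, the sketch below treats each grayscale intensity in [0, 1] as a Bernoulli probability and samples a binary pixel from it. This is not taken from the authors' repository: the function name `stochastically_binarize`, the random seed, and the random stand-in array used in place of real MNIST images are all assumptions made for the example.

```python
import numpy as np


def stochastically_binarize(images, rng):
    """Sample binary pixels, using each intensity in [0, 1] as a Bernoulli probability.

    Hypothetical helper for illustration; not the authors' code.
    """
    images = np.asarray(images, dtype=np.float64)
    if images.min() < 0.0 or images.max() > 1.0:
        raise ValueError("pixel intensities must lie in [0, 1]")
    # A pixel becomes 1 with probability equal to its intensity.
    return (rng.random(images.shape) < images).astype(np.uint8)


rng = np.random.default_rng(0)          # assumed seed, for repeatability only
gray = rng.random((4, 28, 28))          # stand-in for four 28x28 grayscale images
binary = stochastically_binarize(gray, rng)
```

Because the binarization is random, calling the function again on the same images generally gives a different binary sample; whether the sampling is done once up front or afresh at each training step is a detail this summary does not specify.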