Semi-Implicit Variational Inference
Authors: Mingzhang Yin, Mingyuan Zhou
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implement SIVI in Tensorflow (Abadi et al., 2015) for a range of inference tasks. ... The toy examples show SIVI captures skewness, kurtosis, and multimodality. A negative binomial model shows SIVI can accurately capture the dependencies between latent variables. ... With Bayesian logistic regression, we demonstrate that SIVI can either work alone as a black-box inference procedure for correlated latent variables, or directly expand MFVI by adding a mixing distribution, leading to accurate uncertainty estimation on par with that of MCMC. |
| Researcher Affiliation | Academia | 1 Department of Statistics and Data Sciences; 2 Department of IROM, McCombs School of Business; The University of Texas at Austin, Austin, TX 78712, USA. |
| Pseudocode | Yes | we describe the stochastic gradient ascent algorithm to optimize the variational parameter in Algorithm 1 |
| Open Source Code | Yes | Code is provided at https://github.com/mingzhang-yin/SIVI |
| Open Datasets | Yes | We consider the MNIST dataset that is stochastically binarized as in Salakhutdinov & Murray (2008). ... We apply Gibbs sampling, MFVI, and SIVI to a real overdispersed count dataset of Bliss & Fisher (1953)... |
| Dataset Splits | No | The paper states 'We use 55,000 for training and use the 10,000 observations in the testing set for performance evaluation' for the MNIST dataset, but does not explicitly provide details for a validation split. |
| Hardware Specification | Yes | On waveform, the algorithm converges in about 500 iterations, which takes about 40 seconds on a 2.4 GHz CPU. |
| Software Dependencies | No | The paper states 'We implement SIVI in Tensorflow (Abadi et al., 2015)' but does not provide a specific version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | We fix σ₀² = 0.1 and optimize the implicit layer to minimize KL...; With K = 1000...; We set K = 200 for SIVI.; is set as 0.01.; M = 3 stochastic layers |
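The experiment-setup quotes above (K extra mixing samples, a fixed within-layer variance σ₀² = 0.1) refer to SIVI's surrogate evidence lower bound, which mixes an explicit conditional q(z|ψ) over an implicit distribution on ψ. The following is a minimal NumPy sketch of that surrogate estimator on a toy 1-D Gaussian target; the toy model, parameter values other than σ₀² and K, and all function names are illustrative assumptions, not the authors' TensorFlow implementation.

```python
# Hedged sketch of the SIVI surrogate lower bound (Yin & Zhou, 2018) on a
# toy target. The mixing distribution here is a simple Gaussian stand-in
# for the paper's implicit neural sampler; names and values (except
# sigma0^2 = 0.1 and K = 200) are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def log_normal(x, mu, sigma):
    """Elementwise log density of N(mu, sigma^2)."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

# Toy target p(z) = N(0, 1); semi-implicit q: psi ~ N(m, s^2), z|psi ~ N(psi, sigma0^2).
m, s = 0.5, 1.0                 # assumed mixing-distribution parameters
sigma0 = np.sqrt(0.1)           # sigma0^2 = 0.1, as fixed in the paper's setup

def sivi_surrogate_elbo(K=200, n_mc=1000):
    """Monte Carlo estimate of the SIVI surrogate bound with K extra psi samples."""
    psi0 = rng.normal(m, s, size=n_mc)            # psi that generates each z
    z = rng.normal(psi0, sigma0)                  # z ~ q(z | psi0)
    psi_k = rng.normal(m, s, size=(K, n_mc))      # K auxiliary mixing samples
    all_psi = np.vstack([psi0[None, :], psi_k])   # (K+1, n_mc): include psi0
    # log of (1/(K+1)) * sum_k q(z | psi_k), via logsumexp for stability
    log_q = log_normal(z[None, :], all_psi, sigma0)
    log_q_bar = np.logaddexp.reduce(log_q, axis=0) - np.log(K + 1)
    log_p = log_normal(z, 0.0, 1.0)               # toy target log density
    return float(np.mean(log_p - log_q_bar))

elbo = sivi_surrogate_elbo()
print(elbo)
```

Averaging q(z|ψ) over the K+1 mixing samples (including the ψ that generated z) is what makes this a valid lower bound that tightens monotonically as K grows, which is why the paper reports the K values quoted above.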