Kernel Semi-Implicit Variational Inference
Authors: Ziheng Cheng, Longlin Yu, Tianyu Xie, Shiyue Zhang, Cheng Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness and efficiency of KSIVI on both synthetic distributions and a variety of real data Bayesian inference tasks. In this section, we compare KSIVI to the ELBO-based method SIVI and the score-based method SIVI-SM on toy examples and real-data problems. |
| Researcher Affiliation | Academia | School of Mathematical Sciences, Peking University, China; Center for Statistical Science, Peking University, China. |
| Pseudocode | Yes | Algorithm 1: KSIVI with diagonal Gaussian conditional layer and vanilla gradient estimator. The algorithm with the U-statistic gradient estimator (16) is deferred to Appendix B. (A hedged PyTorch sketch follows the table.) |
| Open Source Code | Yes | More implementation details can be found in Appendix C and https://github.com/longinYu/KSIVI. |
| Open Datasets | Yes | We first conduct toy experiments on approximating three two-dimensional distributions: BANANA, MULTIMODAL, and X-SHAPED, whose probability density functions are in Table 4 in Appendix C.1. We consider the WAVEFORM dataset (https://archive.ics.uci.edu/ml/machine-learning-databases/waveform)... across various benchmark UCI datasets (https://archive.ics.uci.edu/ml/datasets.php). |
| Dataset Splits | Yes | The datasets are randomly partitioned into 90% for training and 10% for testing. (A minimal split sketch follows the table.) |
| Hardware Specification | Yes | Table 2 shows the training time per 10,000 iterations of SIVI variants on a 3.2 GHz CPU. |
| Software Dependencies | No | The paper states: "All the experiments are implemented in PyTorch (Paszke et al., 2019)." While PyTorch is named, a specific version number is not provided, nor are any other software dependencies with version numbers. |
| Experiment Setup | Yes | Quoted settings from several experiments: the results are collected after 50,000 iterations with a learning rate of 0.001 for all the methods; the learning rate for variational parameters ϕ is chosen as 0.001 and the batch size of particles is chosen as 100 during training; for all the SIVI variants, the results are collected after 40,000 parameter updates; and for all SIVI variants, the variational parameters ϕ are updated for 100,000 iterations to ensure convergence (Appendix C.3). (A hypothetical training-loop sketch follows the table.) |
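To make the Pseudocode row concrete: below is a minimal PyTorch sketch of a KSIVI-style update, pairing a diagonal Gaussian conditional layer with a kernel Stein discrepancy (KSD) objective. The network architecture, the fixed RBF bandwidth, the V-statistic form of the estimator, and all names here are illustrative assumptions, not the paper's Algorithm 1 verbatim (the paper's vanilla estimator, and the U-statistic variant in its Appendix B, may differ in these details).

```python
# A hedged sketch of a KSIVI-style update: a diagonal Gaussian conditional
# layer trained by minimizing a KSD estimate. Network sizes, the fixed RBF
# bandwidth h, and all names are illustrative assumptions.
import torch
import torch.nn as nn

class DiagGaussianConditionalLayer(nn.Module):
    """Semi-implicit sampler: z = mu(eps) + sigma(eps) * xi, with eps, xi ~ N(0, I)."""
    def __init__(self, latent_dim: int, z_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim),
        )
        self.latent_dim, self.z_dim = latent_dim, z_dim

    def sample(self, n: int) -> torch.Tensor:
        eps = torch.randn(n, self.latent_dim)
        mu, log_sigma = self.net(eps).chunk(2, dim=-1)
        xi = torch.randn(n, self.z_dim)
        return mu + log_sigma.exp() * xi  # reparameterized, so gradients reach phi

def ksd_vstat(z: torch.Tensor, score_p, h: float = 1.0) -> torch.Tensor:
    """V-statistic estimate of the squared KSD with RBF kernel k(x,y)=exp(-|x-y|^2/(2h^2))."""
    s = score_p(z)                            # (n, d): target score at each particle
    diff = z.unsqueeze(1) - z.unsqueeze(0)    # (n, n, d): pairwise x - y
    r2 = (diff ** 2).sum(-1)                  # (n, n): squared pairwise distances
    k = torch.exp(-r2 / (2 * h ** 2))
    d = z.shape[-1]
    # Stein kernel u_p(x, y) for the RBF kernel, written in closed form:
    term1 = s @ s.T                                                    # s_p(x)^T s_p(y)
    term2 = ((s.unsqueeze(1) - s.unsqueeze(0)) * diff).sum(-1) / h**2  # (x-y)^T (s_p(x)-s_p(y)) / h^2
    term3 = d / h ** 2 - r2 / h ** 4                                   # trace(grad_x grad_y k) / k
    return (k * (term1 + term2 + term3)).mean()
```

One design point this sketch is meant to surface, consistent with the paper's framing: the KSD objective needs only the target score and reparameterized samples, so no lower-bound surrogate or inner minimax loop is required.

The 90%/10% partition in the Dataset Splits row is a single random shuffle; a minimal sketch, where the NumPy dependency, seed, and array names are assumptions for illustration:

```python
# Hypothetical 90%/10% random train/test partition.
import numpy as np

def split_90_10(X: np.ndarray, y: np.ndarray, seed: int = 0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))      # random shuffle of example indices
    n_train = int(0.9 * len(X))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]
```

Finally, a hypothetical training loop wiring the two sketches together with hyperparameters quoted in the Experiment Setup row (learning rate 0.001, particle batch size 100, 50,000 iterations); the Adam optimizer and the standard-Gaussian placeholder target are assumptions, not the paper's stated configuration:

```python
# Hypothetical end-to-end loop under the quoted hyperparameters.
sampler = DiagGaussianConditionalLayer(latent_dim=2, z_dim=2)
optimizer = torch.optim.Adam(sampler.parameters(), lr=1e-3)
score_p = lambda z: -z  # score of N(0, I); substitute the actual target's score

for step in range(50_000):
    z = sampler.sample(100)       # batch of 100 particles per update
    loss = ksd_vstat(z, score_p)  # KSD objective; differentiable w.r.t. phi
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```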
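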
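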