Kernel Semi-Implicit Variational Inference

Authors: Ziheng Cheng, Longlin Yu, Tianyu Xie, Shiyue Zhang, Cheng Zhang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness and efficiency of KSIVI on both synthetic distributions and a variety of real data Bayesian inference tasks. In this section, we compare KSIVI to the ELBO-based method SIVI and the score-based method SIVI-SM on toy examples and real-data problems."
Researcher Affiliation | Academia | "School of Mathematical Sciences, Peking University, China; Center for Statistical Science, Peking University, China."
Pseudocode | Yes | "Algorithm 1: KSIVI with diagonal Gaussian conditional layer and vanilla gradient estimator. The algorithm with the U-statistic gradient estimator (16) is deferred to Appendix B."
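Algorithm 1 itself is not reproduced in this report. As a rough illustration of the kind of update it describes, below is a minimal PyTorch sketch of minimizing a kernel Stein discrepancy (KSD) objective through a diagonal Gaussian conditional layer with reparameterized sampling. The network architecture, the RBF kernel with fixed bandwidth h, and the names ConditionalLayer and ksd_loss are illustrative assumptions, not the authors' implementation.

```python
import torch

class ConditionalLayer(torch.nn.Module):
    """Hypothetical diagonal Gaussian conditional layer q(x|z) = N(mu(z), diag(sigma(z)^2))."""
    def __init__(self, z_dim, x_dim, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(z_dim, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 2 * x_dim),
        )

    def forward(self, z):
        mu, log_sigma = self.net(z).chunk(2, dim=-1)
        # Reparameterized sample so gradients reach the variational parameters phi.
        return mu + log_sigma.exp() * torch.randn_like(mu)

def ksd_loss(x, score, h=1.0):
    """V-statistic estimate of the squared KSD with the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 h^2)); `score` returns grad log p of the target."""
    n, d = x.shape
    s = score(x)                                # (n, d) target score at each particle
    diff = x.unsqueeze(1) - x.unsqueeze(0)      # (n, n, d), diff[i, j] = x_i - x_j
    sq = diff.pow(2).sum(-1)                    # (n, n) squared pairwise distances
    k = torch.exp(-sq / (2 * h ** 2))
    t1 = (s @ s.T) * k                                   # s_i . s_j * k
    t2 = (s.unsqueeze(1) * diff).sum(-1) / h ** 2 * k    # s_i . grad_{x_j} k
    t3 = -(s.unsqueeze(0) * diff).sum(-1) / h ** 2 * k   # s_j . grad_{x_i} k
    t4 = (d / h ** 2 - sq / h ** 4) * k                  # trace(grad_{x_i} grad_{x_j} k)
    return (t1 + t2 + t3 + t4).mean()

# Toy usage: fit the layer to a standard Gaussian target, whose score is -x.
layer = ConditionalLayer(z_dim=2, x_dim=2)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
for _ in range(100):
    z = torch.randn(100, 2)
    loss = ksd_loss(layer(z), lambda x: -x)
    opt.zero_grad(); loss.backward(); opt.step()
```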
Open Source Code | Yes | "More implementation details can be found in Appendix C and https://github.com/longinYu/KSIVI."
Open Datasets | Yes | "We first conduct toy experiments on approximating three two-dimensional distributions: BANANA, MULTIMODAL, and X-SHAPED, whose probability density functions are in Table 4 in Appendix C.1." "We consider the WAVEFORM dataset (https://archive.ics.uci.edu/ml/machine-learning-databases/waveform)..." "...across various benchmark UCI datasets (https://archive.ics.uci.edu/ml/datasets.php)."
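Table 4 is not quoted above, so the exact density forms are unavailable here. As a stand-in, the following sketch implements a commonly used banana-shaped log-density and its analytic score (x1 ~ N(0, sigma^2), x2 | x1 ~ N(b * x1^2, 1)); this is an illustrative assumption, not the paper's Table 4 definition. The score function plugs directly into the ksd_loss sketch above.

```python
import torch

def banana_log_prob(x, b=0.5, sigma=2.0):
    # Illustrative banana-shaped density (NOT the exact Table 4 form):
    # x1 ~ N(0, sigma^2), x2 | x1 ~ N(b * x1^2, 1).
    x1, x2 = x[..., 0], x[..., 1]
    return -x1 ** 2 / (2 * sigma ** 2) - (x2 - b * x1 ** 2) ** 2 / 2

def banana_score(x, b=0.5, sigma=2.0):
    # Analytic gradient of banana_log_prob w.r.t. x, kept differentiable so
    # the variational gradient can flow through the particles.
    x1, x2 = x[..., 0], x[..., 1]
    r = x2 - b * x1 ** 2
    return torch.stack([-x1 / sigma ** 2 + 2 * b * x1 * r, -r], dim=-1)
```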
Dataset Splits | Yes | "The datasets are randomly partitioned into 90% for training and 10% for testing."
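A minimal sketch of such a partition; the helper name and the seeding are assumptions, not part of the quoted setup.

```python
import torch

def split_90_10(X, y, seed=0):
    # Random 90% train / 10% test partition, per the quoted setup; fixing a
    # seed is an assumption, made only so the split itself is reproducible.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(X), generator=g)
    n_train = int(0.9 * len(X))
    return X[perm[:n_train]], y[perm[:n_train]], X[perm[n_train:]], y[perm[n_train:]]
```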
Hardware Specification | Yes | "Table 2 shows the training time per 10,000 iterations of SIVI variants on a 3.2 GHz CPU."
Software Dependencies | No | The paper states: "All the experiments are implemented in PyTorch (Paszke et al., 2019)." While PyTorch is named, no version number is given, nor are any other software dependencies pinned.
Experiment Setup | Yes | "The results are collected after 50,000 iterations with a learning rate of 0.001 for all the methods." "The learning rate for variational parameters ϕ is chosen as 0.001 and the batch size of particles is chosen as 100 during the training." "For all the SIVI variants, the results are collected after 40,000 parameter updates." "For all SIVI variants, we update the variational parameters ϕ for 100,000 iterations to ensure convergence." (Appendix C.3)
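Putting the quoted hyperparameters together, a hypothetical training configuration might look as follows. The optimizer choice (Adam) and the reuse of the ConditionalLayer/ksd_loss sketches above are assumptions; the iteration budget differs per experiment as quoted (50,000 for the toy examples, 40,000 or 100,000 elsewhere).

```python
import torch

LEARNING_RATE = 1e-3     # learning rate for the variational parameters phi
BATCH_SIZE = 100         # batch size of particles during training
NUM_ITERATIONS = 50_000  # toy-experiment budget; other experiments use 40k/100k

def train(layer, score, z_dim=2):
    # `layer` and `score` follow the ConditionalLayer / ksd_loss sketches above;
    # Adam is an assumption, as the quoted text names no optimizer.
    opt = torch.optim.Adam(layer.parameters(), lr=LEARNING_RATE)
    for _ in range(NUM_ITERATIONS):
        z = torch.randn(BATCH_SIZE, z_dim)
        loss = ksd_loss(layer(z), score)
        opt.zero_grad()
        loss.backward()
        opt.step()
```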