f-Divergence Variational Inference
Authors: Neng Wan, Dapeng Li, Naira Hovakimyan
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical examples, including variational autoencoders and Bayesian neural networks, are provided to demonstrate the effectiveness and the wide applicability of f-VI. |
| Researcher Affiliation | Collaboration | Neng Wan¹ (nengwan2@illinois.edu), Dapeng Li² (dapeng.ustc@gmail.com), Naira Hovakimyan¹ (nhovakim@illinois.edu); ¹ University of Illinois at Urbana-Champaign, Urbana, IL 61801; ² Anker Innovations, Shenzhen, China |
| Pseudocode | No | A reference black-box f-VI algorithm and the optimization schemes for a few concrete divergences are given in the SM. ... A reference mean-field VI algorithm along with a concrete realization example under KL divergence is provided in the SM. |
| Open Source Code | No | The paper does not provide a direct link to the source code for the methodology or explicitly state that the code is publicly released in the main text. |
| Open Datasets | Yes | The linear regression is performed with twelve datasets from the UCI Machine Learning Repository [36]. ... Bayesian VAE for image reconstruction and generation on the datasets of Caltech 101 Silhouettes [37], Frey Face [38], MNIST [39], and Omniglot [40]. |
| Dataset Splits | Yes | Each dataset is randomly split into 90%/10% for training and testing |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | Adam optimizer with recommended parameters in [35] is employed for stochastic optimization, if not specified. (Note: the optimizer is named, but no software libraries or version numbers are specified.) |
| Experiment Setup | Yes | Adam optimizer with recommended parameters in [35] is employed for stochastic optimization, if not specified. ... The IW-reparameterization gradient (14) with L = 3 and K = 1000 is adopted for the training on a dataset of 500 observations... The IW-reparameterization gradient with L = 5, K = 50 and mini-batch size of 32 is employed for training. After 20 trials with 500 training epochs in each trial... The reparameterization gradient with K = 3, L = 1 is used for training. After 20 trials with 200 training epochs in each trial... (A hypothetical sketch of this setup is given after the table.) |
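
For orientation, below is a minimal, hypothetical sketch of the training configuration quoted above (90%/10% random split, Adam with the parameters recommended in [35], mini-batch size 32, importance-weighted surrogate objective with L and K samples). It is not the authors' code; `run_trial`, the `model` argument, and its `surrogate_loss` method are assumed placeholders for an f-VI model such as the Bayesian VAE used in the paper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def run_trial(model, inputs, epochs=500, batch_size=32, L=5, K=50, seed=0):
    """Hypothetical reproduction sketch; not the authors' implementation.

    `model` is assumed to expose `parameters()` and a `surrogate_loss(x, L, K)`
    method returning the importance-weighted f-VI surrogate objective.
    """
    torch.manual_seed(seed)
    dataset = TensorDataset(inputs)

    # "Each dataset is randomly split into 90%/10% for training and testing"
    n_train = int(0.9 * len(dataset))
    train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

    # Adam with the recommended parameters of [35] (Kingma & Ba):
    # lr = 1e-3, betas = (0.9, 0.999), eps = 1e-8.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                                 betas=(0.9, 0.999), eps=1e-8)

    for _ in range(epochs):
        for (x,) in loader:
            optimizer.zero_grad()
            # L outer samples and K inner (importance) samples, per the quoted setup
            loss = model.surrogate_loss(x, L=L, K=K)
            loss.backward()
            optimizer.step()
    return model, test_set
```

Under the quoted VAE setting this would be called with `epochs=500, L=5, K=50, batch_size=32`; the Bayesian neural network setting quoted above corresponds to `epochs=200, L=1, K=3`. The sketch only illustrates the reported hyperparameters, not the paper's f-VI algorithm itself.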