VISA: Variational Inference with Sequential Sample-Average Approximations
Authors: Heiko Zimmermann, Christian Andersson Naesseth, Jan-Willem van de Meent
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on high-dimensional Gaussians, Lotka-Volterra dynamics, and a Pickover attractor. We demonstrate that VISA can achieve comparable approximation accuracy to standard importance-weighted forward-KL variational inference while requiring significantly fewer samples for conservatively chosen learning rates. |
| Researcher Affiliation | Academia | Amsterdam Machine Learning Lab, University of Amsterdam, Amsterdam, The Netherlands |
| Pseudocode | Yes | Algorithm 1 VISA. Input: initial parameters ϕ0, trust-region threshold α, data y. (A hedged sketch of this loop appears after the table.) |
| Open Source Code | Yes | Python code for the experiments is available at https://github.com/zmheiko/visa |
| Open Datasets | No | The paper evaluates VISA on mathematical models such as "high-dimensional Gaussians, Lotka-Volterra dynamics, and a Pickover attractor" for which data is simulated or generated rather than using publicly available or open datasets with specified access information. |
| Dataset Splits | No | The paper mentions evaluating performance and a "test loss value" but does not explicitly specify training, validation, and test dataset splits (e.g., in percentages or sample counts) needed to reproduce data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and the Optax library, but it does not provide version numbers for these or any other software components, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | For all experiments we use Adam (Kingma & Ba, 2015) as the optimizer, with the learning rates indicated in the experiments. Figure 2 shows the results for learning rates lr ∈ {0.001, 0.005, 0.01, 0.05} and ESS thresholds α ∈ {0.3, 0.6, 0.9, 0.99}. We compute gradient estimates for VISA, IWFVI, and BBVI-SF with N = 10 samples, and gradient estimates for BBVI-RP using a single sample. (A hedged sketch of this sweep also follows the table.) |
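The quoted algorithm header, together with the paper's description (an importance-weighted forward-KL objective optimized on a fixed sample set that is refreshed once the effective sample size drops below the trust-region threshold α), suggests the following minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: `log_joint`, `sample_q`, and `log_q` are hypothetical stand-ins for the model (with the data y folded into `log_joint`) and the variational family, and the ESS-based refresh criterion is paraphrased from the paper's trust-region description.

```python
# Minimal VISA-style sketch (hypothetical helpers: log_joint, sample_q, log_q).
import jax
import jax.numpy as jnp
import optax


def ess_fraction(log_w):
    """Normalized effective sample size ESS/N of a vector of log-weights."""
    w = jax.nn.softmax(log_w)
    return 1.0 / (jnp.sum(w ** 2) * log_w.shape[0])


def visa(phi, log_joint, sample_q, log_q, key,
         num_steps=1000, num_samples=10, alpha=0.9, lr=0.005):
    opt = optax.adam(lr)
    opt_state = opt.init(phi)

    def refresh(phi, key):
        key, sub = jax.random.split(key)
        z = sample_q(phi, sub, num_samples)       # draw a fresh, fixed sample set
        log_w = log_joint(z) - log_q(phi, z)      # SNIS log-weights at phi_0
        return z, log_w, phi, key                 # phi_0 anchors the current SAA

    z, log_w, phi0, key = refresh(phi, key)

    def saa_loss(phi):
        # Sample-average approximation of the forward KL: the weights are
        # fixed at phi_0, so only log q_phi depends on the parameters.
        w_bar = jax.nn.softmax(log_w)
        return -jnp.sum(w_bar * log_q(phi, z))

    for _ in range(num_steps):
        grads = jax.grad(saa_loss)(phi)
        updates, opt_state = opt.update(grads, opt_state, phi)
        phi = optax.apply_updates(phi, updates)
        # Trust-region check: refresh the sample set once q_phi drifts
        # too far from the sampling distribution q_{phi_0}.
        if ess_fraction(log_q(phi, z) - log_q(phi0, z)) < alpha:
            z, log_w, phi0, key = refresh(phi, key)
    return phi
```

Because the weights in `saa_loss` are held fixed between refreshes, the objective is deterministic in ϕ, which is what lets the same N = 10 samples be reused across many optimizer steps.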
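The experiment-setup row is explicit enough to reconstruct as a configuration sweep. A minimal sketch, assuming a hypothetical `run_visa` entry point (for example, the loop above); the grid values and sample count are taken verbatim from the paper's description.

```python
# Hypothetical reconstruction of the reported sweep; run_visa is a stand-in
# for a training loop like the sketch above, not the paper's actual script.
import itertools

learning_rates = [0.001, 0.005, 0.01, 0.05]  # Adam learning rates (Figure 2)
ess_thresholds = [0.3, 0.6, 0.9, 0.99]       # trust-region / ESS thresholds
num_samples = 10                             # N for VISA, IWFVI, and BBVI-SF

for lr, alpha in itertools.product(learning_rates, ess_thresholds):
    run_visa(lr=lr, alpha=alpha, num_samples=num_samples)
```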