VISA: Variational Inference with Sequential Sample-Average Approximations

Authors: Heiko Zimmermann, Christian Andersson Naesseth, Jan-Willem van de Meent

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform experiments on high-dimensional Gaussians, Lotka-Volterra dynamics, and a Pickover attractor. We demonstrate that VISA can achieve comparable approximation accuracy to standard importance-weighted forward-KL variational inference while requiring significantly fewer samples for conservatively chosen learning rates.
Researcher Affiliation | Academia | Amsterdam Machine Learning Lab, University of Amsterdam, Amsterdam, The Netherlands
Pseudocode | Yes | Algorithm 1 VISA. Input: initial parameters ϕ₀, trust-region threshold α, data y. (A hedged sketch of the corresponding training loop appears below the table.)
Open Source Code | Yes | Python code for the experiments is available at https://github.com/zmheiko/visa
Open Datasets | No | The paper evaluates VISA on simulated models such as "high-dimensional Gaussians, Lotka-Volterra dynamics, and a Pickover attractor", for which data is generated rather than drawn from publicly available datasets with specified access information.
Dataset Splits | No | The paper mentions evaluating performance and a "test loss value" but does not explicitly specify training, validation, and test splits (e.g., as percentages or sample counts) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using Adam as an optimizer and Optax, but it does not provide version numbers for these or any other software libraries, which would be necessary for full reproducibility.
Experiment Setup | Yes | For all experiments we use Adam (Kingma & Ba, 2015) as an optimizer with the learning rates as indicated in the experiments. Figure 2 shows the results for different learning rates lr ∈ {0.001, 0.005, 0.01, 0.05} and ESS thresholds α ∈ {0.3, 0.6, 0.9, 0.99}. We compute gradient estimates for VISA, IWFVI, and BBVI-SF with N = 10 samples, and gradient estimates for BBVI-RP using a single sample. (A sketch of this sweep follows the algorithm sketch below.)
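
For readers who want the gist of Algorithm 1 without opening the paper, here is a minimal sketch of a VISA-style loop in JAX/Optax: fix a sample set drawn from the current variational distribution, optimize the resulting deterministic (sample-average) importance-weighted forward-KL objective, and redraw samples only when the effective sample size (ESS) relative to the snapshot that generated them falls below the threshold α. This is our reading, not the authors' implementation; the diagonal-Gaussian family, the batched log_joint(z) = log p(z, y), and helper names (log_q, sample_q, ess_fraction, saa_loss, fit_visa) are assumptions for illustration.

```python
# Hedged sketch of a VISA-style loop (our reading of Algorithm 1, not the
# authors' code). Assumes a diagonal-Gaussian q_phi and a batched log_joint.
import jax
import jax.numpy as jnp
import optax

def log_q(phi, z):
    # Log-density of a diagonal Gaussian with params phi = (mean, log_std).
    mean, log_std = phi
    return jnp.sum(-0.5 * ((z - mean) / jnp.exp(log_std)) ** 2
                   - log_std - 0.5 * jnp.log(2.0 * jnp.pi), axis=-1)

def sample_q(phi, key, n):
    # Draw n samples from q_phi by scaling and shifting standard normals.
    mean, log_std = phi
    eps = jax.random.normal(key, (n, mean.shape[0]))
    return mean + jnp.exp(log_std) * eps

def ess_fraction(phi, phi_snap, z):
    # Normalized ESS of the weights q_phi(z) / q_phi_snap(z) on the fixed z.
    w = jax.nn.softmax(log_q(phi, z) - log_q(phi_snap, z))
    return 1.0 / (z.shape[0] * jnp.sum(w ** 2))

def saa_loss(phi, z, logw):
    # Sample-average approximation of the importance-weighted forward KL;
    # the self-normalized weights are frozen (they depend on the snapshot).
    w = jax.nn.softmax(logw)
    return -jnp.sum(w * log_q(phi, z))

def fit_visa(log_joint, phi, key, n_samples=10, alpha=0.9, lr=0.01, steps=1000):
    opt = optax.adam(lr)
    opt_state = opt.init(phi)
    key, sub = jax.random.split(key)
    z = sample_q(phi, sub, n_samples)          # fixed SAA sample set
    phi_snap = phi                             # parameters that generated z
    logw = log_joint(z) - log_q(phi_snap, z)   # unnormalized forward-KL weights
    grad_fn = jax.grad(saa_loss)
    for _ in range(steps):
        updates, opt_state = opt.update(grad_fn(phi, z, logw), opt_state)
        phi = optax.apply_updates(phi, updates)
        # Refresh the approximation once phi leaves the ESS trust region.
        if ess_fraction(phi, phi_snap, z) < alpha:
            key, sub = jax.random.split(key)
            z = sample_q(phi, sub, n_samples)
            phi_snap = phi
            logw = log_joint(z) - log_q(phi_snap, z)
    return phi
```

The key difference from per-step importance-weighted forward-KL VI (IWFVI) is that the same samples and weights are reused across many optimizer steps; fresh samples are drawn only when the ESS fraction drops below α, which is how conservative learning rates translate into far fewer model evaluations.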
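As a concrete reading of the setup row, the reported grid of learning rates and ESS thresholds could be swept as below. The target here is a placeholder standard normal so the snippet runs end to end, not one of the paper's models, and fit_visa refers to the sketch above.

```python
# Hypothetical sweep over the grid quoted in the setup row; the target is a
# stand-in standard normal, not one of the paper's experiments.
import itertools
import jax
import jax.numpy as jnp

def log_joint(z):
    return jnp.sum(-0.5 * z ** 2, axis=-1)   # placeholder log p(z, y)

dim = 2
key = jax.random.PRNGKey(0)
for lr, alpha in itertools.product([0.001, 0.005, 0.01, 0.05],
                                   [0.3, 0.6, 0.9, 0.99]):
    phi0 = (jnp.zeros(dim), jnp.zeros(dim))   # (mean, log_std) initialization
    phi = fit_visa(log_joint, phi0, key, n_samples=10, alpha=alpha,
                   lr=lr, steps=200)
    print(f"lr={lr}, alpha={alpha} -> fitted mean {phi[0]}")
```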