EigenVI: score-based variational inference with orthogonal function expansions

Authors: Diana Cai, Chirag Modi, Charles Margossian, Robert Gower, David Blei, Lawrence Saul

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb. On these distributions, we find that EigenVI is more accurate than existing methods for Gaussian BBVI.
Researcher Affiliation | Collaboration | Diana Cai, Flatiron Institute (dcai@flatironinstitute.org); Chirag Modi, Flatiron Institute (cmodi@flatironinstitute.org); Charles C. Margossian, Flatiron Institute (cmargossian@flatironinstitute.org); Robert M. Gower, Flatiron Institute (rgower@flatironinstitute.org); David M. Blei, Columbia University (david.blei@columbia.edu); Lawrence K. Saul, Flatiron Institute (lsaul@flatironinstitute.org)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | We provide a Julia implementation of EigenVI at https://github.com/dicai/eigenVI and a demonstration on several examples.
Open Datasets | Yes | We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb.
Dataset Splits | No | The paper does not provide train/validation/test dataset splits or describe a splitting methodology. While standard benchmarks like posteriordb might imply typical splits, the paper itself does not specify them.
Hardware Specification | Yes | The experiments were run on a Linux workstation with a 32-core Intel(R) Xeon(R) w5-3435X processor and with 503 GB of memory. Experiments were run on CPU.
Software Dependencies | No | The paper mentions using "off-the-shelf eigenvalue solvers, such as ARPACK [30] or Julia's eigenvalue decomposition function, eigen", but does not specify exact version numbers for Julia or ARPACK. (A minimal sketch of this eigensolver step appears after the table.)
Experiment Setup | Yes | For all experiments, we used a proposal distribution π that was uniform on [-5, 5]^2... For the Gaussian score matching (GSM) method [37], we chose a batch size of 16 for all experiments... For the batch and match (BaM) method [6], we chose a batch size of 16. The learning rate was fixed at λ_t = BD/(t+1)... For all ELBO optimization methods (full-covariance Gaussian family and normalizing-flow family), we used Adam to optimize the ELBO. We performed a grid search over learning rates {0.01, 0.02, 0.05, 0.1} and batch sizes B ∈ {4, 8, 16, 32}. For the normalizing flow model, we used a real NVP [12] with 8 layers and 32 neurons. (A sketch of the proposal sampling and hyperparameter grid appears after the table.)
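
The software-dependencies row notes that EigenVI relies on off-the-shelf eigensolvers, since the score-matching objective over an orthogonal function expansion reduces to a minimum-eigenvalue problem. The following is a minimal Julia sketch of that step, not the released implementation: it assumes a symmetric matrix M has already been assembled from score evaluations (a random positive semidefinite placeholder is used here) and extracts the eigenvector associated with the smallest eigenvalue, using LinearAlgebra.eigen or, optionally, ARPACK via Arpack.jl.

    using LinearAlgebra

    # Placeholder for the symmetric matrix that EigenVI assembles from score
    # evaluations at proposal samples; a random PSD matrix stands in for it here.
    A = rand(50, 50)
    M = Symmetric(A' * A)

    # Dense route: Julia's built-in eigendecomposition. For symmetric matrices,
    # `eigen` returns eigenvalues in ascending order, so the first eigenvector
    # corresponds to the smallest eigenvalue.
    F = eigen(M)
    coeffs = F.vectors[:, 1]

    # Iterative route for larger basis expansions: ARPACK via Arpack.jl.
    # using Arpack
    # vals, vecs = eigs(M; nev=1, which=:SM)   # smallest-magnitude eigenpair

In the paper's formulation, this minimum eigenvector supplies the coefficients of the orthogonal function expansion; the construction of the matrix itself is omitted from this sketch.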
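For the experiment-setup row, the sketch below illustrates two concrete details quoted there: sampling from the uniform proposal on [-5, 5]^2 and enumerating the Adam grid search over learning rates and batch sizes. The function name sample_proposal and the fixed seed are illustrative choices, not part of the paper's code.

    using Random
    Random.seed!(0)   # illustrative seed; the paper does not report one

    # Proposal distribution π: uniform on the square [-5, 5]^2.
    sample_proposal(S::Int) = 10 .* rand(2, S) .- 5

    X = sample_proposal(1_000)            # 2 × 1000 matrix of proposal samples
    @assert all(-5 .<= X .<= 5)

    # Hyperparameter grid for the ELBO baselines (optimized with Adam):
    # 4 learning rates × 4 batch sizes = 16 configurations.
    grid = [(lr, B) for lr in (0.01, 0.02, 0.05, 0.1), B in (4, 8, 16, 32)]
    println(length(grid), " configurations in the grid search")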