Variational Inference with Coverage Guarantees in Simulation-Based Inference

Authors: Yash Patel, Declan McNamara, Jackson Loper, Jeffrey Regier, Ambuj Tewari

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we demonstrate the accurate calibration and high predictive efficiency of CANVI on a suite of simulation-based inference benchmark tasks and an important scientific task: analyzing galaxy emission spectra.
Researcher Affiliation | Academia | Department of Statistics, University of Michigan, Ann Arbor, USA. Correspondence to: Yash Patel <yppatel@umich.edu>.
Pseudocode | Yes | The full CANVI framework is provided in Algorithm 1.
Open Source Code | Yes | Details are provided in Appendix G, and code is available at https://github.com/yashpatel5400/canvi.git.
Open Datasets | Yes | We evaluate on the standard SBI benchmark tasks, highlighted in (Delaunoy et al., 2023). For full descriptions of the tasks, refer to Appendix F. The benchmark tasks are a subset of those provided by (Lueckmann et al., 2021). The PROVABGS emulator (Section 4.3) was trained to minimize the MSE using normalized simulated PROVABGS outputs with fixed log stellar mass parameter (Hahn et al., 2023).
Dataset Splits | Yes | CANVI was applied to an NPE, in which D_C was taken to be 10% of the simulation budgets and the remainder used for training. We take D_R to be the same size as D_C, i.e. |D_R| = N_C. Algorithm 1: D_C, D_R ~ P(X, Θ), D_T ~ P(X).
Hardware Specification | Yes | Training these models required between 10 minutes and two hours using an Nvidia RTX 2080 Ti GPU for each of the SBI tasks.
Software Dependencies | No | The paper mentions "PyTorch (Paszke et al., 2019)", the "Neural Spline Flow architecture", "Adam (Kingma & Ba, 2014)", and "nflows: normalizing flows in PyTorch" (Durkan et al., 2020a). While these indicate the software used, specific version numbers (e.g., PyTorch 1.x.x) are not explicitly provided.
Experiment Setup | Yes | Optimization was done using Adam (Kingma & Ba, 2014) with a learning rate of 10^-3 over 5,000 training steps. Specific architecture hyperparameter choices were taken to be the defaults from (Durkan et al., 2020a) and are available in the code. All three methods were trained for 10,000 steps using the Adam optimizer with a learning rate of 0.0001.
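
The benchmark tasks quoted in the Open Datasets row come from the Lueckmann et al. (2021) suite, which is distributed as the sbibm package. Below is a minimal sketch of drawing a simulation budget from one of those tasks; the task name and budget size are illustrative choices, not the paper's exact settings.

```python
import sbibm

# Load one of the standard SBI benchmark tasks (task name is illustrative).
task = sbibm.get_task("two_moons")
prior = task.get_prior()          # callable: prior(num_samples=...) -> theta
simulator = task.get_simulator()  # callable: simulator(theta) -> x

# Draw a simulation budget of (theta, x) pairs from the joint P(X, Theta).
num_simulations = 10_000          # illustrative budget, not the paper's value
theta = prior(num_samples=num_simulations)
x = simulator(theta)
```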
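
The Dataset Splits row says 10% of the simulation budget is held out as a calibration set D_C, with D_R the same size, and the Pseudocode row points to Algorithm 1 for the full CANVI procedure. The sketch below is a generic split of that kind plus the standard split-conformal quantile computation; it is not Algorithm 1 itself, and the function names, the fixed seed, and the choice of nonconformity score are assumptions for illustration.

```python
import math
import torch

def split_budget(theta, x, calib_frac=0.10, seed=0):
    """Split a simulation budget into training and calibration sets (D_C is 10%)."""
    n = theta.shape[0]
    perm = torch.randperm(n, generator=torch.Generator().manual_seed(seed))
    n_calib = int(calib_frac * n)
    calib_idx, train_idx = perm[:n_calib], perm[n_calib:]
    return (theta[train_idx], x[train_idx]), (theta[calib_idx], x[calib_idx])

def conformal_quantile(scores, alpha=0.05):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = scores.numel()
    level = math.ceil((n + 1) * (1 - alpha)) / n
    return torch.quantile(scores, min(level, 1.0))
```

Given nonconformity scores on D_C (for example, the negative approximate posterior density, which is one natural choice here), a prediction region for a new observation x is {θ : score(θ, x) ≤ q̂}; this is the usual split-conformal construction that the paper's coverage guarantee builds on.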
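
The Experiment Setup row reports Adam with a learning rate of 10^-3 over 5,000 training steps for the amortized posterior. Below is a minimal NPE-style training loop under those settings; q_phi is assumed to be a conditional density estimator exposing an nflows-style log_prob(theta, context=x) interface (e.g., a neural spline flow), and the batch size is an assumption, not a value reported in the paper.

```python
import torch

def train_npe(q_phi, theta_train, x_train, steps=5_000, lr=1e-3, batch_size=256):
    """Amortized NPE training: maximize E[log q_phi(theta | x)] with Adam."""
    opt = torch.optim.Adam(q_phi.parameters(), lr=lr)
    n = theta_train.shape[0]
    for _ in range(steps):
        idx = torch.randint(0, n, (batch_size,))
        # q_phi is assumed to expose log_prob(inputs, context=...), as nflows-style
        # conditional flows do; this interface is an assumption, not the paper's code.
        loss = -q_phi.log_prob(theta_train[idx], context=x_train[idx]).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return q_phi
```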