Compositional Score Modeling for Simulation-Based Inference

Authors: Tomas Geffner, George Papamakarios, Andriy Mnih

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 presents a comprehensive empirical evaluation of the proposed approaches on a range of tasks typically used to evaluate SBI methods (Lueckmann et al., 2021). Our results show that our proposed methods tend to outperform relevant baselines when multiple observations are available at inference time, and that the use of methods in the PF-NPSE family often leads to increased robustness.
Researcher Affiliation | Collaboration | Tomas Geffner (1,2), George Papamakarios (3), Andriy Mnih (3). 1: Work done during an internship at DeepMind. 2: University of Massachusetts, Amherst. 3: DeepMind. Correspondence to: Tomas Geffner <tgeffner@cs.umass.edu>, Andriy Mnih <andriy@deepmind.com>.
Pseudocode | Yes | Algorithm 1 (Annealed Langevin with learned scores) and Algorithm 2 (Sampling with unadjusted Langevin dynamics).
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the method, nor links to a repository.
Open Datasets | Yes | We now present a systematic evaluation on four tasks typically used to evaluate SBI methods (Lueckmann et al., 2021)... Gaussian/Gaussian (G-G)... Gaussian/Mixture of Gaussians (G-MoG)... Susceptible-Infected-Recovered (SIR)... Lotka-Volterra (LV)... Weinberg simulator... The simulator from Cranmer et al. (2017).
Dataset Splits | Yes | Unless specified otherwise, each method is given a budget of 10^4 simulator calls, and optimization is carried out using Adam (Kingma & Ba, 2014) with a learning rate of 10^-4 for a maximum of 20k epochs (using 20% of the training data as a validation set for early stopping).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running experiments.
Software Dependencies | No | The paper mentions software components like 'Adam', 'NumPyro', 'RealNVP layers', and 'LayerNorm' along with citations, but does not provide specific version numbers for any of these software dependencies.
Experiment Setup | Yes | Unless specified otherwise, each method is given a budget of 10^4 simulator calls, and optimization is carried out using Adam (Kingma & Ba, 2014) with a learning rate of 10^-4 for a maximum of 20k epochs (using 20% of the training data as a validation set for early stopping). We use L = 5 and δ_t = 0.3 (1 − α_t)/α_t, where α_1 = γ_1 and α_t = γ_t/γ_{t−1} for t = 2, ..., T − 1. We use n_max = 30 for NPE and NPSE and m = 6 for PF-NPSE.
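
Since no code is released for this paper, the following is only a minimal NumPy sketch of the kind of annealed unadjusted Langevin sampler and step-size schedule quoted in the Pseudocode and Experiment Setup rows above, not the authors' implementation. The names score_fn, gamma, theta_init and step_scale are hypothetical; in the paper the score would come from the trained posterior score network, with per-observation scores composed when several observations are available, and the indexing/annealing convention below is an assumption.

    import numpy as np

    def annealed_langevin_sample(score_fn, gamma, theta_init, L=5, step_scale=0.3, rng=None):
        # Annealed unadjusted Langevin dynamics with a learned score (sketch).
        # Hypothetical interfaces, not taken from the paper's code:
        #   score_fn(theta, t) -> estimated score of the noise-perturbed posterior
        #                         at noise level t,
        #   gamma              -> length-T array of cumulative noise-schedule values
        #                         gamma_t, so alpha_1 = gamma_1 and
        #                         alpha_t = gamma_t / gamma_{t-1}.
        rng = np.random.default_rng() if rng is None else rng
        gamma = np.asarray(gamma, dtype=float)
        theta = np.asarray(theta_init, dtype=float)

        # Step sizes quoted in the Experiment Setup row:
        # delta_t = 0.3 * (1 - alpha_t) / alpha_t.
        alpha = np.concatenate(([gamma[0]], gamma[1:] / gamma[:-1]))
        delta = step_scale * (1.0 - alpha) / alpha

        # Sweep the noise levels from most to least perturbed (the direction of
        # annealing; the exact indexing convention is an assumption), running
        # L unadjusted Langevin steps at each level.
        for t in reversed(range(len(gamma))):
            for _ in range(L):
                noise = rng.standard_normal(theta.shape)
                theta = (theta
                         + 0.5 * delta[t] * score_fn(theta, t)
                         + np.sqrt(delta[t]) * noise)
        return theta

A caller would wrap the trained score network as score_fn and pass the noise schedule used during training as gamma; nothing in this sketch should be read as the exact sampler from Algorithms 1 and 2.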