Flow Matching for Scalable Simulation-Based Inference
Authors: Jonas Wildberger, Maximilian Dax, Simon Buchholz, Stephen R. Green, Jakob H. Macke, Bernhard Schölkopf
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a number of experiments to investigate the performance of FMPE. Our two-pronged approach, which involves a set of benchmark tests and a real-world problem, is designed to probe complementary aspects of the method, covering breadth and depth of applications. First, on an established suite of SBI benchmarks, we show that FMPE performs comparably or better than NPE across most tasks, and in particular exhibits mass-covering posteriors in all cases (Sec. 4). We then push the performance limits of FMPE on a challenging real-world problem by turning to gravitational-wave inference (Sec. 5). |
| Researcher Affiliation | Academia | Jonas Wildberger, Max Planck Institute for Intelligent Systems, Tübingen, Germany (wildberger.jonas@tuebingen.mpg.de); Maximilian Dax, Max Planck Institute for Intelligent Systems, Tübingen, Germany (maximilian.dax@tuebingen.mpg.de); Simon Buchholz, Max Planck Institute for Intelligent Systems, Tübingen, Germany (sbuchholz@tue.mpg.de); Stephen R. Green, University of Nottingham, Nottingham, United Kingdom; Jakob H. Macke, Max Planck Institute for Intelligent Systems & Machine Learning in Science, University of Tübingen, Tübingen, Germany; Bernhard Schölkopf, Max Planck Institute for Intelligent Systems, Tübingen, Germany |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available here. |
| Open Datasets | Yes | We now evaluate FMPE on ten tasks included in the benchmark presented in [46]... For each task, we train three separate FMPE models with simulation budgets N ∈ {10^3, 10^4, 10^5}... We use the data settings described in [7], with a few minor modifications. In particular, we use the waveform model IMRPhenomPv2 [76-78] and the prior displayed in Tab. 4. |
| Dataset Splits | Yes | We reserve 5% of the simulations for validation. |
| Hardware Specification | Yes | We train the NPE and FMPE networks with 5×10^6 simulations for 400 epochs using a batch size of 4096 on an A100 GPU. |
| Software Dependencies | No | The paper mentions building on 'public DINGO code' and using 'dopri5 discretization' but does not specify version numbers for these or other software components. |
| Experiment Setup | Yes | We sweep over the batch size and learning rate (which is particularly important as the simulation budgets differ by orders of magnitude), the network size, and the α parameter for the time prior defined in Section 3.3 (see Tab. 2 for the specific values). Table 2: Sweep values for the hyperparameters for the SBI benchmark: hidden dimensions 2^n for n ∈ {4, ..., 10}; number of blocks 10, ..., 18; batch size 2^n for n ∈ {2, ..., 9}; learning rate 1e-3, 5e-4, 2e-4, 1e-4; α (for time prior) -0.25, -0.5, 0, 1, 4. A hypothetical sketch of this sweep grid appears after the table. |
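
As a rough illustration of the Experiment Setup row, the sketch below enumerates the Table 2 sweep values in Python. The dictionary keys, the use of `itertools.product`, and the assumption of a full Cartesian grid are our own illustration; the paper only lists the candidate values and does not state how the sweep was executed or which tooling was used.

```python
# Hypothetical sketch of the Table 2 sweep grid; the key names are illustrative
# only and do not correspond to identifiers in the authors' code.
import itertools

sweep_space = {
    "hidden_dim": [2 ** n for n in range(4, 11)],    # 16, 32, ..., 1024
    "num_blocks": list(range(10, 19)),               # 10, 11, ..., 18
    "batch_size": [2 ** n for n in range(2, 10)],    # 4, 8, ..., 512
    "learning_rate": [1e-3, 5e-4, 2e-4, 1e-4],
    "time_prior_alpha": [-0.25, -0.5, 0, 1, 4],
}

# Enumerate every configuration (Cartesian product of the candidate values).
# Whether the authors ran the full grid or sampled from it is not stated.
keys = list(sweep_space)
configs = [dict(zip(keys, values))
           for values in itertools.product(*sweep_space.values())]

print(f"{len(configs)} candidate configurations")  # 7 * 9 * 8 * 4 * 5 = 10,080
print(configs[0])
```

Enumerated this way, the grid contains 10,080 combinations, so in practice such a sweep would likely be subsampled or run per simulation budget rather than exhaustively.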