Consistency Models for Scalable and Fast Simulation-Based Inference
Authors: Marvin Schmitt, Valentin Pratz, Ullrich Köthe, Paul-Christian Bürkner, Stefan Radev
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed on two realistic estimation problems with high data and/or parameter dimensions. |
| Researcher Affiliation | Academia | Marvin Schmitt, University of Stuttgart, Germany (mail.marvinschmitt@gmail.com); Valentin Pratz, Heidelberg University & ELIZA, Germany; Ullrich Köthe, Heidelberg University, Germany; Paul-Christian Bürkner, TU Dortmund University, Germany; Stefan T. Radev, Rensselaer Polytechnic Institute, United States |
| Pseudocode | No | The paper describes algorithms and sampling procedures in textual paragraphs (e.g., in Section 3.1, 'Sampling'), but it does not include explicitly labeled 'Algorithm' or pseudocode blocks or figures (a hedged sketch of a generic multistep consistency sampler is given after the table). |
| Open Source Code | Yes | The software code for the experiments is available in a public GitHub repository: https://github.com/bayesflow-org/consistency-model-posterior-estimation |
| Open Datasets | Yes | This experiment demonstrates the feasibility of CMPE for a high-dimensional inverse problem, namely, Bayesian denoising on the Fashion MNIST data set [61]. |
| Dataset Splits | No | The simulation-based training phase is based on a fixed training set {(x^(m), θ^(m))}_{m=1}^{M} of M tuples of data sets x^(m) and corresponding ground-truth parameters θ^(m). All metrics are computed on a test set of J unseen instances {(x^(j), θ^(j))}_{j=1}^{J}. While the paper mentions using a 'validation set' for hyperparameter tuning in a discussion of limitations, it does not explicitly specify a validation dataset split for the reported experiments (an illustrative simulation-based train/test split is sketched after the table). |
| Hardware Specification | Yes | For neural network training, we used a Mac M1 CPU for Experiments 1–3, an NVIDIA V100 GPU for Experiment 4, an NVIDIA RTX 4090 GPU for Experiment 5, and an NVIDIA H100 GPU for Experiment 6. The evaluation scripts were executed on a Mac M1 CPU for Experiments 1–2, on an NVIDIA V100 GPU for Experiments 3–4, on an NVIDIA RTX 4090 GPU for Experiment 5, and on an NVIDIA V100 GPU for Experiment 6. |
| Software Dependencies | No | We implement all experiments using the BayesFlow Python library for amortized Bayesian workflows [55]. Users can choose between a PyTorch, TensorFlow, or JAX backend. We use an AdamW optimizer. While software components are mentioned, specific version numbers for libraries such as BayesFlow, PyTorch, TensorFlow, or JAX are not provided (a version-recording sketch is given after the table). |
| Experiment Setup | Yes | CMPE relies on an MLP with 2 hidden layers of 256 units each, L2 regularization with weight 10^-4, 10% dropout, and an initial learning rate of 10^-4. The consistency model is instantiated with the hyperparameters s_0 = 10, s_1 = 1280, T_max = 1. Training is based on 2000 epochs with batch size 64 (a hedged configuration sketch is given after the table). |
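
The paper describes the CMPE sampling procedure only in prose (Section 3.1). For illustration, here is a minimal sketch of generic multistep consistency-model sampling in the conditional posterior setting, following the standard consistency-model sampler of Song et al.; `consistency_fn`, `t_schedule`, `theta_dim`, and all numeric defaults are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def sample_cmpe(consistency_fn, x_obs, theta_dim, t_schedule, t_max=1.0, eps=1e-3, rng=None):
    """Few-step sampling with a trained conditional consistency model.

    consistency_fn(theta_t, t, x_obs) -> denoised theta is a hypothetical
    stand-in for the trained network; t_schedule is a decreasing sequence
    of intermediate noise levels in (eps, t_max).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Start from pure noise at the maximal noise level T_max.
    theta = t_max * rng.standard_normal(theta_dim)
    theta = consistency_fn(theta, t_max, x_obs)       # one-step estimate
    for t in t_schedule:                              # optional refinement steps
        z = rng.standard_normal(theta_dim)
        theta_t = theta + np.sqrt(t**2 - eps**2) * z  # re-noise to level t
        theta = consistency_fn(theta_t, t, x_obs)     # map back towards t = eps
    return theta
```

With an empty `t_schedule` this reduces to single-step sampling; few-step sampling uses one or two intermediate noise levels between `eps` and `t_max`.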
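The dataset-splits row refers to a fixed simulation-based training set of M tuples and a held-out test set of J instances. The following sketch shows how such a split is typically generated in simulation-based inference; `prior`, `simulator`, and the sizes `M` and `J` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def prior():                      # hypothetical prior over parameters
    return rng.normal(size=2)

def simulator(theta):             # hypothetical forward model
    return theta + 0.1 * rng.normal(size=theta.shape)

M, J = 10_000, 1_000              # illustrative training / test set sizes
params = np.stack([prior() for _ in range(M + J)])
data = np.stack([simulator(th) for th in params])

train_params, train_data = params[:M], data[:M]   # fixed training set
test_params, test_data = params[M:], data[M:]     # unseen test instances
```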
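Because the paper does not report library versions, a small script along these lines can pin down the actual environment when reproducing the experiments. Selecting the backend via the `KERAS_BACKEND` environment variable is an assumption about a Keras-3-based BayesFlow installation and is not stated in the paper.

```python
import os
# Assumption: a Keras-3-based BayesFlow install picks its backend from this
# environment variable ("jax", "torch", or "tensorflow").
os.environ.setdefault("KERAS_BACKEND", "jax")

from importlib.metadata import version, PackageNotFoundError

# Record the versions that the paper leaves unspecified.
for pkg in ("bayesflow", "keras", "jax", "torch", "tensorflow", "numpy"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```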
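The experiment-setup row reports the MLP and training hyperparameters for one benchmark configuration. The sketch below instantiates only those reported settings (2 hidden layers of 256 units, L2 weight 10^-4, 10% dropout, AdamW with learning rate 10^-4, 2000 epochs, batch size 64) in plain Keras; the input/output dimensions, activation, and loss are placeholders, and the actual CMPE network additionally conditions on the noise level and the observed data and is trained with the consistency objective rather than a mean-squared error.

```python
import keras
from keras import layers, regularizers

theta_dim = 2  # illustrative parameter dimension

# 2 hidden layers of 256 units, L2 weight 1e-4, 10% dropout (as reported).
inputs = keras.Input(shape=(theta_dim,))
h = inputs
for _ in range(2):
    h = layers.Dense(256, activation="relu",
                     kernel_regularizer=regularizers.L2(1e-4))(h)
    h = layers.Dropout(0.10)(h)
outputs = layers.Dense(theta_dim)(h)

model = keras.Model(inputs, outputs)
model.compile(optimizer=keras.optimizers.AdamW(learning_rate=1e-4),
              loss="mse")  # placeholder loss, not the consistency objective

# Reported training budget: 2000 epochs with batch size 64, e.g.
# model.fit(train_data, train_params, epochs=2000, batch_size=64)
```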