Consistency Models for Scalable and Fast Simulation-Based Inference

Authors: Marvin Schmitt, Valentin Pratz, Ullrich Köthe, Paul-Christian Bürkner, Stefan Radev

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Our empirical evaluation demonstrates that CMPE not only outperforms current state-of-the-art algorithms on hard low-dimensional benchmarks, but also achieves competitive performance with much faster sampling speed on two realistic estimation problems with high data and/or parameter dimensions.
Researcher Affiliation Academia Marvin Schmitt (University of Stuttgart, Germany, mail.marvinschmitt@gmail.com); Valentin Pratz (Heidelberg University & ELIZA, Germany); Ullrich Köthe (Heidelberg University, Germany); Paul-Christian Bürkner (TU Dortmund University, Germany); Stefan T. Radev (Rensselaer Polytechnic Institute, United States)
Pseudocode No The paper describes algorithms and sampling procedures in textual paragraphs (e.g., in Section 3.1 'Sampling'), but it does not include explicitly labeled 'Algorithm' or 'Pseudocode' blocks or figures.
Open Source Code Yes The software code for the experiments is available in a public GitHub repository: https://github.com/bayesflow-org/consistency-model-posterior-estimation
Open Datasets Yes This experiment demonstrates the feasibility of CMPE for a high-dimensional inverse problem, namely, Bayesian denoising on the Fashion MNIST data set [61].
Dataset Splits No The simulation-based training phase is based on a fixed training set {(x^(m), θ^(m))}_{m=1}^M of M tuples of data sets x^(m) and corresponding ground-truth parameters θ^(m). All metrics are computed on a test set of J unseen instances {(x^(j), θ^(j))}_{j=1}^J. While the paper mentions using a 'validation set' for hyperparameter tuning in a discussion of limitations, it does not explicitly specify a validation dataset split for the reported experiments.
Hardware Specification Yes For neural network training, we used a Mac M1 CPU for Experiments 1–3, an NVIDIA V100 GPU for Experiment 4, an NVIDIA RTX 4090 GPU for Experiment 5, and an NVIDIA H100 GPU for Experiment 6. The evaluation scripts were executed on a Mac M1 CPU for Experiments 1–2, on an NVIDIA V100 GPU for Experiments 3–4, on an NVIDIA RTX 4090 GPU for Experiment 5, and on an NVIDIA V100 GPU for Experiment 6.
Software Dependencies No We implement all experiments using the BayesFlow Python library for amortized Bayesian workflows [55]. Users can choose between a PyTorch, TensorFlow, or JAX backend. We use an AdamW optimizer. While software components are mentioned, specific version numbers for libraries like PyTorch, TensorFlow, JAX, or the AdamW optimizer are not provided.
Experiment Setup Yes CMPE relies on an MLP with 2 hidden layers of 256 units each, L2 regularization with weight 10^-4, 10% dropout, and an initial learning rate of 10^-4. The consistency model is instantiated with the hyperparameters s_0 = 10, s_1 = 1280, T_max = 1. Training is based on 2000 epochs with batch size 64.
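The experiment-setup row above can be sketched in code. The following is a minimal PyTorch sketch of the reported configuration, not the authors' implementation: the hyperparameter values (2×256 hidden units, 10% dropout, learning rate 10^-4, L2 weight 10^-4, s_0 = 10, s_1 = 1280, T_max = 1) are taken from the quoted setup, while the exact layer wiring, activation function, and input/output dimensions are assumptions for illustration.

```python
# Hypothetical sketch of the reported CMPE network configuration.
# Hyperparameter values come from the paper's quoted setup; the layer
# wiring, ReLU activation, and in_dim/out_dim below are assumptions.
import torch
import torch.nn as nn

class ConsistencyMLP(nn.Module):
    """MLP with 2 hidden layers of 256 units and 10% dropout."""
    def __init__(self, in_dim: int, out_dim: int,
                 hidden: int = 256, p_drop: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Illustrative dimensions (not specified in the quoted text).
model = ConsistencyMLP(in_dim=8, out_dim=2)

# AdamW with the reported learning rate 1e-4; weight decay stands in
# for the reported L2 regularization weight of 1e-4.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

# Consistency-model schedule constants and training budget as quoted.
s0, s1, T_max = 10, 1280, 1.0
epochs, batch_size = 2000, 64
```

Note that AdamW's `weight_decay` is decoupled weight decay rather than a literal L2 penalty term in the loss; the quoted text does not say which of the two the authors used, so this mapping is a guess.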