Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Leveraging Self-Consistency for Data-Efficient Amortized Bayesian Inference
Authors: Marvin Schmitt, Desi R. Ivanova, Daniel Habermann, Ullrich Koethe, Paul-Christian Bürkner, Stefan T. Radev
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Empirical Evaluation We evaluate our self-consistent estimator across a range of synthetic tasks and real-world problems. |
| Researcher Affiliation | Academia | (1) University of Stuttgart, Germany; (2) University of Oxford, UK; (3) TU Dortmund University, Germany; (4) Heidelberg University, Germany; (5) Rensselaer Polytechnic Institute, USA. |
| Pseudocode | Yes | Algorithm 1 Self-consistency loss for finite training. {I}: likelihood-based with analytic likelihood {II}: simulation-based with approximate likelihood |
| Open Source Code | Yes | Code Availability We provide reproducible code in the open repository at https://github.com/marvinschmitt/self-consistency-abi |
| Open Datasets | Yes | As a scientific real-world example, we apply our method to an experimental data set in biology (Silk et al., 2011). |
| Dataset Splits | No | No specific details on dataset split percentages, counts, or methodology for training/validation/test sets were provided. While 'posterior loss on a separate validation dataset' is mentioned in Appendix E, the exact details of this split (e.g., size, methodology) are not specified. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU/CPU models, memory, or cloud instance specifications. |
| Software Dependencies | No | The paper mentions software like 'Stan', 'neural spline flow', 'Deep Set', 'tensorflow_probability', and 'scipy.stats', but does not provide specific version numbers for these software dependencies, only citations to their original papers. |
| Experiment Setup | Yes | The neural networks are trained for a total of 35 epochs with a batch size of 32 and an initial learning rate of 10⁻³. We choose a stepwise constant annealing schedule for the self-consistency weight λ such that λ = 0 for the first 5 epochs, and λ = 1 for the remaining 30 epochs. |
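
The stepwise constant schedule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code: the function and constant names are hypothetical, epochs are assumed to be 0-indexed, and combining the two loss terms as a weighted sum (posterior loss plus λ times the self-consistency loss) is an assumption made for illustration only.

```python
# Hyperparameters as quoted in the Experiment Setup row.
TOTAL_EPOCHS = 35
BATCH_SIZE = 32
INITIAL_LEARNING_RATE = 1e-3
WARMUP_EPOCHS = 5  # epochs during which the self-consistency weight is 0


def self_consistency_weight(epoch: int, warmup_epochs: int = WARMUP_EPOCHS) -> float:
    """Stepwise constant schedule for lambda (hypothetical helper).

    Returns 0.0 for the first `warmup_epochs` epochs (0-indexed)
    and 1.0 for every epoch after that.
    """
    return 0.0 if epoch < warmup_epochs else 1.0


def total_loss(posterior_loss: float, sc_loss: float, epoch: int) -> float:
    """Assumed combination: posterior loss + lambda * self-consistency loss."""
    return posterior_loss + self_consistency_weight(epoch) * sc_loss


# One lambda value per training epoch.
schedule = [self_consistency_weight(epoch) for epoch in range(TOTAL_EPOCHS)]
```

Under these assumptions, `schedule` holds five zeros followed by thirty ones, so the self-consistency term only contributes to training from the sixth epoch onward.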