Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Reverse-Annealed Sequential Monte Carlo for Efficient Bayesian Optimal Experiment Design

Authors: Jake Callahan, Andrew Chin, Jason Pacheco, Tommie Catanach

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate this estimator on a coupled spring-mass system, Johnson-Cook model of plastic deformation, and a sequential source location problem all with multimodal posteriors. Not only do we show that traditional SMC estimators can be used with an order of magnitude fewer particles, but also that our reverse estimator provides a further fourfold improvement in computational cost. Section 5 Experimental Results
Researcher Affiliation	Collaboration	Jake Callahan Program in Applied Mathematics The University of Arizona Andrew Chin Department of Biostatistics Johns Hopkins University Jason Pacheco Department of Computer Science The University of Arizona Tommie Catanach Computational Data Science Sandia National Laboratories
Pseudocode	Yes	Algorithm 1 Reverse annealed sequential Monte Carlo. Require: Dataset (θ, y), Temperatures t = {t_0, ..., t_N}. Ensure: Output ÎG(y)
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We do not include external code or data in this submission but all data are synthetic and fully specified in the paper, and our algorithmic procedures are described in enough detail to reproduce the results.
Open Datasets	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We do not include external code or data in this submission but all data are synthetic and fully specified in the paper, and our algorithmic procedures are described in enough detail to reproduce the results.
Dataset Splits	No	In BOED we simulate the data and thus know the true underlying parameters. We consider ten equally spaced designs γ between 0 and 2 of the form 0.2j, j = 1, ..., 10. [Figure 2: Three examples of observed data y in orange, along with the true signal x in blue for the spring-mass model; axes: spring position against time (t). The true signal corresponds to the position of the first mass m1 between time t = 0 and t = 50, and is observed over 100 equally-spaced time points.] 1000 datasets are drawn for each method, with performance measured by the number of likelihood evaluations required and how well the final EIG values correspond to those of an expensive forward SMC run.
Hardware Specification	No	Likelihood evaluations are the dominant cost and are therefore used as a hardware agnostic metric for computational efficiency.
Software Dependencies	No	The paper mentions various algorithms and methods such as "sequential Monte Carlo (SMC)", "Markov chain Monte Carlo (MCMC)", "simple random walk Metropolis", and refers to prior work like "Catanach & Beck (2018)" for an SMC algorithm. However, it does not specify any particular software libraries, packages, or programming languages with their version numbers that were used for implementation.
Experiment Setup	Yes	For our backward estimator, we use a fixed tempering sequence based on Calderhead & Girolami (2009), with N = 100 levels and temperatures t_i = (i/N)^5 (see Corollary A.1). For the MCMC kernel we use a simple random walk Metropolis, where the proposal standard deviation is adapted based on the previous temperature's acceptance rate using the feedback controller of Catanach (2017) to target an ideal acceptance rate of 0.234 (Gelman et al., 1997). Based on an analysis shown in Appendix A.2.1, we fix the number of tempering levels at 100 and only adjust the number of MCMC iterations per temperature for simplicity. We implement an early stopping criterion for MCMC at each level once the Spearman correlation between the starting log likelihoods and the current log likelihoods drops below 0.1. We also halve the number of steps taken once the proposal standard deviation equals the prior standard deviation, indicating that the power posterior is diffuse enough such that more iterations are not critical. For tuning, we fix a design and run multiple MCMC iterations, stopping roughly when our estimates stabilize while accounting for Monte Carlo standard error, which ended up being 60 iterations.
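
Illustrative sketch (not the authors' code, which is not released): the Experiment Setup row above describes a power-law tempering schedule and a Spearman-correlation early-stopping rule. The Python below shows one way these two pieces might look, assuming NumPy only. The function names tempering_schedule, spearman_corr, and should_stop are hypothetical, the rank computation assumes no tied log likelihoods, and the feedback controller of Catanach (2017) for the proposal scale is not reproduced here.

```python
import numpy as np


def tempering_schedule(n_levels=100, power=5):
    """Temperatures t_i = (i / N)^power for i = 0, ..., N (hypothetical helper)."""
    i = np.arange(n_levels + 1)
    return (i / n_levels) ** power


def spearman_corr(a, b):
    """Spearman rank correlation as the Pearson correlation of ranks.

    Assumes no ties, which is reasonable for continuous log likelihoods.
    """
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


def should_stop(start_loglik, current_loglik, threshold=0.1):
    """True once the particles' current log likelihoods have decorrelated
    from their starting values at this temperature level."""
    return spearman_corr(start_loglik, current_loglik) < threshold


temps = tempering_schedule()          # 101 temperatures from 0 to 1
start = np.array([-3.2, -1.1, -7.4, -0.5, -2.8])
```

With this schedule most temperatures sit close to 0, so the levels are concentrated near the prior end of the path; should_stop(start, start) is False because the chains have not yet moved.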