Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Balancing Performance and Costs in Best Arm Identification

Authors: Michael Harding, Kirthevasan Kandasamy

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We then demonstrate the performance of DBCARE on a number of simulated models, comparing to fixed budget and confidence algorithms to show the shortfalls of existing BAI paradigms on this problem.
Researcher Affiliation	Academia	Michael O. Harding Department of Statistics University of Wisconsin-Madison EMAIL Kirthevasan Kandasamy Department of Computer Science University of Wisconsin-Madison EMAIL
Pseudocode	Yes	Algorithm 1 Dynamically Budgeted Cost-Adapted Risk-minimizing Elimination
Open Source Code	No	Answer: [No] Justification: We do not provide access to the data and code.
Open Datasets	Yes	We present the results of a real data experiment on a drug discovery dataset. For this experiment, we take the results from Table 2 of Genovese et al. [19]
Dataset Splits	No	Results are averaged across 10^5 runs each with different random seeds.
Hardware Specification	Yes	All experiments were performed using a 3.7GHz AMD Ryzen 9 5900X 12-Core processor with 24 GB of memory.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	We study the performance across a range of suboptimality gaps for Gaussian and Bernoulli rewards in the two-arm setting using the cost c = 10^-4. In the Gaussian setting, the arms have variance σ^2 = 1 with means ±ε/2, for ε ∈ [0.05, 2]; for Bernoulli arms, the means are 0.5 ± ε/2, for ε ∈ [0.01, 0.95]. Results are averaged across 10^5 runs each with different random seeds. We compare to Sequential Halving for fixed budget and elimination procedures using the optimized stopping rules of [30] for fixed confidence. We use budgets T = 10 and T = 500 and confidences of δ = 0.1 and δ = 0.01 for comparison against relatively low and high confidence/budget choices.
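
The two-arm setup quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code (the paper states that no code is released). It does not implement DBCARE or the cost term c = 10^-4, since neither is defined on this page. The Gaussian means are taken to be +ε/2 and -ε/2, which is an assumption about the setup. The function names, the default of 2000 runs (far fewer than the paper's 10^5), and the random tie-breaking rule are all hypothetical choices made for illustration. With two arms, Sequential Halving reduces to a single round that splits the budget evenly and keeps the arm with the higher empirical mean.

```python
import numpy as np


def sample_rewards(rng, dist, eps, n):
    """Draw n rewards per arm for a two-arm instance with gap eps; shape (2, n)."""
    if dist == "gaussian":
        # Assumption: arm means are +eps/2 and -eps/2 with unit variance.
        means = np.array([eps / 2, -eps / 2])
        return rng.normal(loc=means[:, None], scale=1.0, size=(2, n))
    if dist == "bernoulli":
        # Arm means are 0.5 + eps/2 and 0.5 - eps/2.
        means = np.array([0.5 + eps / 2, 0.5 - eps / 2])
        return (rng.random((2, n)) < means[:, None]).astype(float)
    raise ValueError(f"unknown dist: {dist}")


def sequential_halving_two_arms(rng, dist, eps, budget):
    """Single-round Sequential Halving for two arms; returns the chosen arm index."""
    n = budget // 2  # even split of the budget across both arms
    empirical_means = sample_rewards(rng, dist, eps, n).mean(axis=1)
    if empirical_means[0] == empirical_means[1]:
        # Hypothetical tie-breaking rule: pick uniformly at random.
        return int(rng.integers(2))
    return int(np.argmax(empirical_means))


def misidentification_rate(dist, eps, budget, runs=2000, seed=0):
    """Fraction of runs that fail to return arm 0 (the better arm)."""
    # One independent child seed per run, mirroring "different random seeds".
    child_seeds = np.random.SeedSequence(seed).spawn(runs)
    errors = sum(
        sequential_halving_two_arms(np.random.default_rng(s), dist, eps, budget) != 0
        for s in child_seeds
    )
    return errors / runs


if __name__ == "__main__":
    for dist, eps in [("gaussian", 0.05), ("gaussian", 2.0),
                      ("bernoulli", 0.01), ("bernoulli", 0.95)]:
        for budget in (10, 500):
            rate = misidentification_rate(dist, eps, budget)
            print(f"{dist:9s} eps={eps:<5} T={budget:<4} error rate={rate:.3f}")
```

Running the script prints an estimated misidentification rate for the smallest and largest gap in each reward family at both budgets, which is one way to see why a fixed budget can be either wasteful or insufficient depending on the gap.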