Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Multi-Armed Bandit Designs for Dose-Finding Trials

Authors: Maryam Aziz, Emilie Kaufmann, Marie-Karelle Riviere

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through a large simulation study, we then show that variants of Thompson Sampling based on more sophisticated prior distributions outperform state-of-the-art dose identification algorithms in different types of dose-finding studies that occur in phase I or phase I/II trials. Keywords: Multi-Armed Bandits; Adaptive Clinical Trials; Phase I Clinical Trials; Phase I/II Clinical Trials; Thompson Sampling; Bayesian methods.
Researcher Affiliation Collaboration Maryam Aziz (EMAIL), Spotify, NYC, USA; Emilie Kaufmann (EMAIL), Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 CRIStAL, F-59000 Lille, France; Marie-Karelle Riviere (EMAIL), Statistical Methodology Group, Biostatistics and Programming Department, Sanofi R&D, Chilly-Mazarin, France
Pseudocode Yes Algorithm 1 Sequential Halving for MTD Identification. Input: budget n, target toxicity θ. Initialization: set of dose levels S_0 = {1, ..., K}. For r = 0 to ⌈log2(K)⌉ − 1: allocate each dose k ∈ S_r to t_r = ⌊n / (|S_r| ⌈log2(K)⌉)⌋ patients; based on their responses compute p̂_k^r, the empirical toxicity of dose k from these t_r samples; compute S_{r+1}, the set of ⌈|S_r|/2⌉ arms with smallest d̂_k^r := |θ − p̂_k^r|. Output: the unique arm in S_{⌈log2(K)⌉}.
Open Source Code No The paper does not provide explicit links to source code for the methodology it describes, nor does it state that the code is available in supplementary materials. It mentions using the 'Stan implementation of these Monte-Carlo sampler', but this refers to a third-party tool.
Open Datasets No The paper uses 'simulated clinical trials' and defines 'scenarios' with specific probability values (e.g., 'p0 = [0.06 0.12 0.20 0.30 0.40 0.50]'). These are generated scenarios for a simulation study, not publicly available external datasets.
Dataset Splits No The paper conducts a simulation study using defined 'scenarios' and 'cohorts of patients of size 3', with a 'budget n = 36' or 'n = 60' for the total number of patients in a trial. These describe the setup of the simulated trials and the patient allocation strategy, but not traditional training, validation, or test splits of a fixed dataset.
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the simulation studies or experiments.
Software Dependencies Yes In practice, we use the Stan implementation of these Monte-Carlo sampler (Stan Development Team, 2015), and use (many) samples to approximate integrals under the posterior when needed. Stan Development Team. Stan modeling language users guide and reference manual. http://mc-stan.org, version 2.8.0, 2015.
Experiment Setup Yes We used a start-up phase for all designs (starting from the smallest dose and escalating until the first toxicity is observed) and we also used cohorts of patients of size 3. We report experiments with the value ε = 0.05 for TS(ε) and c1 = 0.8 for TS A. Furthermore, we use the same parameters for the admissible set and the implementation of MTARA as those chosen by Riviere et al. (2017): ξ = 0.2, c1 = 0.9, c2 = 0.4, and s1 = .2 1 I / n
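The Sequential Halving pseudocode quoted above (Algorithm 1) can be sketched as runnable code. This is a minimal illustration, not the authors' implementation: the true per-dose toxicity probabilities and the Bernoulli patient responses are simulation assumptions, using the scenario vector p0 quoted in the Open Datasets row.

```python
import math
import random

def sequential_halving_mtd(n, theta, toxicity_probs, seed=0):
    """Sketch of Sequential Halving for MTD identification (Algorithm 1).

    n: total patient budget; theta: target toxicity rate;
    toxicity_probs: assumed true toxicity probability of each dose level.
    """
    rng = random.Random(seed)
    K = len(toxicity_probs)
    rounds = math.ceil(math.log2(K))
    S = list(range(K))  # surviving dose levels S_r
    for _ in range(rounds):
        # Patients allocated to each surviving dose this round: t_r = floor(n / (|S_r| * ceil(log2 K)))
        t_r = n // (len(S) * rounds)
        # Empirical toxicity p_hat of each dose from t_r simulated Bernoulli responses
        p_hat = {k: sum(rng.random() < toxicity_probs[k] for _ in range(t_r)) / t_r
                 for k in S}
        # Keep the half of the doses whose empirical toxicity is closest to theta
        S = sorted(S, key=lambda k: abs(theta - p_hat[k]))[: math.ceil(len(S) / 2)]
    return S[0]  # the unique remaining arm

# Example with the scenario probabilities quoted above and budget n = 36
p0 = [0.06, 0.12, 0.20, 0.30, 0.40, 0.50]
print(sequential_halving_mtd(36, theta=0.30, toxicity_probs=p0))
```

With K = 6 doses the loop runs ⌈log2(6)⌉ = 3 rounds, shrinking the candidate set 6 → 3 → 2 → 1, so a single dose index is returned.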