Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Lenient Regret for Multi-Armed Bandits

Authors: Nadav Merlis, Shie Mannor

Pages: 8950-8957

AAAI 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we present an empirical evaluation of ϵ-TS. Specifically, we compare ϵ-TS to the vanilla TS on two different gap functions: f(Δ) = Δ, which leads to the standard regret, and the hinge function f(Δ) = max{Δ − ϵ, 0}. All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3.
Researcher Affiliation	Collaboration	1Technion Institute of Technology, Israel 2Nvidia Research, Israel
Pseudocode	Yes	Algorithm 1: ϵ-TS for Bernoulli arms
Open Source Code	No	The paper does not contain any explicit statements about making the source code available, nor does it provide links to a code repository.
Open Datasets	No	The paper mentions using “Bernoulli rewards” and running simulations “over 50,000 different seeds,” indicating a synthetic data generation process based on a known distribution rather than a specific, named public dataset with access information or a citation.
Dataset Splits	No	The paper describes simulation-based evaluations (“over 50,000 different seeds” with “Bernoulli rewards”) and analyzes asymptotic and finite-time behavior, but it does not specify explicit train/validation/test dataset splits or their percentages.
Hardware Specification	No	The paper does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, memory specifications) used to run the experiments.
Software Dependencies	No	The paper does not specify any software dependencies or their version numbers (e.g., programming languages, libraries, frameworks) used in the experiments.
Experiment Setup	Yes	All evaluations were performed for ϵ = 0.2 over 50,000 different seeds and are depicted in Figure 3. We tested 4 different scenarios when the optimal arm is smaller or larger than 1 − ϵ (left and right columns, respectively), and when the minimal gap is larger or smaller than ϵ (top and bottom rows, respectively).
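
The two gap functions quoted under Research Type can be illustrated with a minimal sketch. It is not the authors' code and it does not implement ϵ-TS (Algorithm 1); it runs vanilla Thompson Sampling with Beta(1, 1) priors on Bernoulli arms and accumulates both the standard regret, f(Δ) = Δ, and the hinge lenient regret, f(Δ) = max{Δ − ϵ, 0}. The arm means, horizon, seed count, and all function names are hypothetical choices made for illustration; only ϵ = 0.2 is taken from the quoted text.

```python
import numpy as np


def standard_gap(delta):
    """f(Δ) = Δ, which recovers the standard regret."""
    return delta


def hinge_gap(delta, eps):
    """f(Δ) = max{Δ - ϵ, 0}: arms within ϵ of the best incur no regret."""
    return np.maximum(delta - eps, 0.0)


def run_vanilla_ts(means, horizon, eps, seed):
    """Vanilla Thompson Sampling on Bernoulli arms (illustrative only).

    Returns (standard regret, hinge lenient regret) summed over the horizon.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    gaps = means.max() - means          # Δ_a for every arm a
    successes = np.ones(len(means))     # Beta(1, 1) prior: alpha
    failures = np.ones(len(means))      # Beta(1, 1) prior: beta
    std_regret = 0.0
    hinge_regret = 0.0
    for _ in range(horizon):
        # Sample a mean for each arm from its posterior and play the argmax.
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))
        reward = float(rng.random() < means[arm])
        successes[arm] += reward
        failures[arm] += 1.0 - reward
        std_regret += float(standard_gap(gaps[arm]))
        hinge_regret += float(hinge_gap(gaps[arm], eps))
    return std_regret, hinge_regret


if __name__ == "__main__":
    # Hypothetical arm means: the second arm is within ϵ of the best arm.
    means = [0.5, 0.4, 0.2]
    results = [run_vanilla_ts(means, horizon=1000, eps=0.2, seed=s)
               for s in range(20)]
    std_mean, hinge_mean = np.mean(results, axis=0)
    print(f"standard regret: {std_mean:.2f}, lenient regret: {hinge_mean:.2f}")
```

Because max{Δ − ϵ, 0} ≤ Δ for every nonnegative gap, the lenient regret in this sketch can never exceed the standard regret; pulls of an arm whose gap is below ϵ add nothing to it.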