Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk

Authors: Tianrui Chen, Aditya Gangrade, Venkatesh Saligrama

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schema... Empirical Results. We complement the above theoretical study with simulations.
Researcher Affiliation	Academia	¹Boston University, ²Carnegie Mellon University.
Pseudocode	Yes	Algorithm 1 Doubly Optimistic Confidence Bounds; Algorithm 2 Thompson Sampling With Optimistic Safety Indices (TOPSI) for Bernoulli Bandits; Algorithm 3 Thompson Sampling with BAYESUCB (TSBU) for Bernoulli Bandits
Open Source Code	No	The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets	Yes	For the sake of realism, we use the data of Genovese et al. (2013), who report efficacy and infection rates from a phase 2 randomised trial for various dosages of a drug to treat rheumatoid arthritis.
Dataset Splits	No	The paper describes running simulations over a 'horizon' and 'trials' (e.g., '100 trials of horizon 50000'), but it does not specify training, validation, or test splits in the usual machine-learning sense.
Hardware Specification	No	The paper mentions that the methods are implemented in MATLAB, but it does not specify the hardware (CPU or GPU model, memory, etc.) used to run the experiments.
Software Dependencies	No	The paper states 'All methods are implemented on MATLAB' and mentions specific MATLAB functions like 'betainv' and 'betarnd' from the Statistics Toolbox, but it does not provide specific version numbers for MATLAB or its toolboxes.
Experiment Setup	Yes	The data reported is across 100 trials of horizon 50000. ... We study the safety level 0.21... KL-UCB-based bounds are all evaluated with γ_t = 1/t... BAYESUCB-based bounds are all evaluated with δ_t^k = 1/(t + 1).