reproducibilityindex.ai

Quantile Bandits for Best Arms Identification

Authors: Mengyan Zhang, Cheng Soon Ong

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show illustrative experiments for best arm identification. In this section, we illustrate how the proposed Q-SAR algorithm works on a toy example (Section 5.1) and demonstrate the empirical performance on a vaccine simulation (Section 5.2).
Researcher Affiliation	Academia	1 The Australian National University; 2 Data61, CSIRO. Correspondence to: Cheng Soon Ong <chengsoon.ong@anu.edu.au>.
Pseudocode	Yes	Algorithm 1 Q-SAR
Open Source Code	Yes	https://github.com/Mengyanz/QSAR
Open Datasets	No	The paper describes generating its own data for experiments (e.g., 'We generate 1000 rewards for each strategy by simulating the epidemic for 180 days using FluTE', 'constructing three arms with absolute Gaussian distribution or exponential distribution') and links to the simulation tool, but does not explicitly state that the generated dataset itself is publicly available or provide a direct link/citation to it.
Dataset Splits	No	The paper sets a 'fixed budget of N rounds' for the bandit problem, which dictates the number of samples, but it does not describe specific training, validation, or test dataset splits in terms of percentages or counts, or refer to standard predefined splits.
Hardware Specification	No	The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions 'FluTE' (with a link to its GitHub repository) as a tool used for vaccine simulation, but does not specify a version number for this or any other software dependencies like programming languages or libraries.
Experiment Setup	Yes	We divide the budget N into K − 1 phases. The number of samples drawn for each arm in each phase remains the same as in Bubeck et al. (2013). Let the active set $A_1 = \{1, \dots, K\}$, the accepted set $M_1 = \emptyset$, the number of arms left to find $l_1 = m$, $\overline{\log}(K) = \frac{1}{2} + \sum_{i=2}^{K} \frac{1}{i}$, $n_0 = 0$, and for $p \in \{1, \dots, K-1\}$, $n_p = \left\lceil \frac{1}{\overline{\log}(K)} \frac{N-K}{K+1-p} \right\rceil$. We set up simulated environments by constructing three arms with absolute Gaussian distribution or exponential distribution. We generate 1000 rewards for each strategy by simulating the epidemic for 180 days using FluTE (with basic reproduction number $R_0 = 1.3$).
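
The phase schedule quoted in the Experiment Setup row can be checked numerically. The sketch below is not taken from the authors' repository; it assumes the standard reading of the Bubeck et al. (2013) schedule, in which n_p is the cumulative number of pulls of each active arm by the end of phase p and phase p has K + 1 − p active arms. The function names (log_bar, phase_lengths, total_pulls) and the example values N = 300, K = 3 are hypothetical, chosen only for illustration.

```python
import math
from typing import List


def log_bar(K: int) -> float:
    """Normalising constant: 1/2 + sum_{i=2}^{K} 1/i."""
    return 0.5 + sum(1.0 / i for i in range(2, K + 1))


def phase_lengths(N: int, K: int) -> List[int]:
    """Return [n_0, n_1, ..., n_{K-1}] with n_0 = 0 and
    n_p = ceil((N - K) / (log_bar(K) * (K + 1 - p)))."""
    lb = log_bar(K)
    return [0] + [math.ceil((N - K) / (lb * (K + 1 - p))) for p in range(1, K)]


def total_pulls(N: int, K: int) -> int:
    """Total samples used, assuming phase p has K + 1 - p active arms and
    each is pulled n_p - n_{p-1} more times (an assumption of this sketch)."""
    n = phase_lengths(N, K)
    return sum((n[p] - n[p - 1]) * (K + 1 - p) for p in range(1, K))


if __name__ == "__main__":
    N, K = 300, 3  # hypothetical budget and number of arms
    print(phase_lengths(N, K))
    print(total_pulls(N, K), "<=", N)
```

Under these assumptions the schedule never spends more than the budget N, which is the property the ceiling and the (N − K) numerator are designed to guarantee.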