reproducibilityindex.ai

A One-Size-Fits-All Solution to Conservative Bandit Problems

Authors: Yihan Du, Siwei Wang, Longbo Huang
Pages: 7254-7261

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments for the considered problems. The results match our theoretical bounds and demonstrate that our algorithms achieve the performance superiority compared to existing algorithms.
Researcher Affiliation	Academia	Yihan Du (1), Siwei Wang (1), Longbo Huang (1); (1) Tsinghua University
Pseudocode	Yes	Algorithm 1: General Solution to Conservative Bandits (GenCB). Algorithm 2: MV-CUCB.
Open Source Code	No	The paper does not provide a direct link to the source code or explicitly state that the code for the described methodology is open-source. The URL in the reference section points to the paper's arXiv preprint.
Open Datasets	No	In all experiments, we assume the rewards to take i.i.d. Bernoulli values. For CMAB, we set K ∈ {24, 72, 144}, α ∈ {0.05, 0.1, 0.15}, µ0 = 0.7 and µ1, ..., µK as an arithmetic sequence from 0.8 to 0.2. For CLB and CCCB, we set d ∈ {5, 7, 9}, α ∈ {0.01, 0.02, 0.03}, K = 2d and f(A, w) = Σ_{e∈A} w_e. For MV-CBP, we use the same parameter settings as CMAB and additionally set ρ ∈ {10, 30, 60}. This indicates that the data is simulated, not from a publicly available dataset with concrete access information.
Dataset Splits	No	The paper describes how synthetic data is generated for simulations, but does not mention specific training, validation, or testing splits of any fixed dataset.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper mentions various algorithms (e.g., UCB, LinUCB, C2UCB) that are integrated into GenCB, but it does not specify any software dependencies (e.g., programming languages, libraries, frameworks) with version numbers.
Experiment Setup	Yes	In all experiments, we assume the rewards to take i.i.d. Bernoulli values. For CMAB, we set K ∈ {24, 72, 144}, α ∈ {0.05, 0.1, 0.15}, µ0 = 0.7 and µ1, ..., µK as an arithmetic sequence from 0.8 to 0.2. For CLB and CCCB, we set d ∈ {5, 7, 9}, α ∈ {0.01, 0.02, 0.03}, K = 2d and f(A, w) = Σ_{e∈A} w_e. For MV-CBP, we use the same parameter settings as CMAB and additionally set ρ ∈ {10, 30, 60}. For each algorithm, we perform 50 independent runs and present the average (middle curve), maximum (upper curve) and minimum (bottom curve) cumulative regrets across runs.
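
Since no source code is released, the CMAB simulation protocol quoted in the Experiment Setup row can only be approximated. The sketch below is one possible reading of it: Bernoulli arms whose means form an arithmetic sequence from 0.8 to 0.2, 50 independent runs, and pointwise average, maximum and minimum cumulative regret. Several things here are assumptions and not taken from the paper: the policy is plain UCB1 used as a stand-in (it is not GenCB and ignores the conservative constraint, the baseline arm µ0 and α), the horizon of 2000 rounds is hypothetical, and all function names are illustrative.

```python
import math
import random


def arithmetic_means(k, start=0.8, stop=0.2):
    """Arm means mu_1..mu_K as an arithmetic sequence from start to stop."""
    if k == 1:
        return [start]
    step = (stop - start) / (k - 1)
    return [start + i * step for i in range(k)]


def ucb1_regret(means, horizon, rng):
    """Cumulative pseudo-regret curve of plain UCB1 on Bernoulli arms.

    Stand-in policy only: this is NOT the paper's GenCB algorithm.
    """
    k = len(means)
    best = max(means)
    counts = [0] * k
    sums = [0.0] * k
    regret, curve = 0.0, []
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialisation: pull every arm once
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )
        # i.i.d. Bernoulli reward with the chosen arm's mean
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
        curve.append(regret)
    return curve


def summarize(curves):
    """Pointwise average, maximum and minimum across runs."""
    n, horizon = len(curves), len(curves[0])
    avg = [sum(c[t] for c in curves) / n for t in range(horizon)]
    upper = [max(c[t] for c in curves) for t in range(horizon)]
    lower = [min(c[t] for c in curves) for t in range(horizon)]
    return avg, upper, lower


def run_experiment(k=24, horizon=2000, runs=50, seed=0):
    """Run `runs` independent simulations (one seed each) and summarize them."""
    means = arithmetic_means(k)
    curves = [
        ucb1_regret(means, horizon, random.Random(seed + r)) for r in range(runs)
    ]
    return summarize(curves)
```

Calling `run_experiment(k=24)` returns three lists that correspond to the middle, upper and bottom curves described in the setup. Seeding each run separately keeps the runs independent while leaving the whole experiment repeatable.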