On Limited-Memory Subsampling Strategies for Bandits

Authors: Dorian Baudry, Yoan Russac, Olivier Cappé

ICML 2021

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | Extensive numerical simulations highlight the merits of this approach, particularly when the changes are not only affecting the means of the rewards.
Researcher Affiliation | Academia | Univ. Lille, CNRS, Inria, Centrale Lille, UMR 9189 - CRIStAL, F-59000 Lille, France; DI ENS, CNRS, Inria, ENS, Université PSL, Paris, France.
Pseudocode | Yes | Algorithm 1 LB-SDA (a hedged sketch of one round is given below the table).
Open Source Code | Yes | The code for obtaining the different figures reported in the paper is available at https://github.com/YRussac/LB-SDA.
Open Datasets | No | The paper uses simulated environments based on Bernoulli and Gaussian distributions, rather than pre-existing public datasets, and therefore does not provide access information for a public dataset (a sketch of such an environment follows below the table).
Dataset Splits | No | The paper describes experiments on simulated bandit environments and does not specify traditional train/validation/test dataset splits. Performance is evaluated over a horizon T using independent replications.
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU or CPU models, or memory amounts) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers like Python 3.8, PyTorch 1.9) required to replicate the experiments.
Experiment Setup | Yes | To allow for a fair comparison, SW-LB-SDA uses the same window τ = 2√(T log(T)/Γ_T) that is recommended for SW-UCB (Garivier & Moulines, 2011). D-UCB uses the discount factor suggested by Garivier & Moulines (2011), 1/(1−γ) = 4√(T/Γ_T). For CUSUM, α and h are tuned using the suggestions of Liu et al. (2017), namely α = √(Γ_T/T) log(T/Γ_T) and h = log(T/Γ_T). A snippet computing these values is given below the table.