On Explore-Then-Commit Strategies
Authors: Aurélien Garivier, Tor Lattimore, Emilie Kaufmann
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we provide empirical evidence that the theory also holds in practice and discuss extensions to the non-Gaussian and multiple-armed case. Numerical experiments illustrate and empirically support our results in Section 5. |
| Researcher Affiliation | Academia | Aurélien Garivier, Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS UPS IMT, F-31062 Toulouse Cedex 9, France (aurelien.garivier@math.univ-toulouse.fr); Emilie Kaufmann, Univ. Lille, CNRS, Centrale Lille, Inria SequeL, UMR 9189 CRIStAL (Centre de Recherche en Informatique, Signal et Automatique de Lille), F-59000 Lille, France (emilie.kaufmann@univ-lille1.fr); Tor Lattimore, University of Alberta, 116 St & 85 Ave, Edmonton, AB T6G 2R3, Canada (tor.lattimore@gmail.com) |
| Pseudocode | Yes | Algorithm 1: FB-ETC algorithm; Algorithm 2: SPRT-ETC algorithm; Algorithm 3: BAI-ETC algorithm; Algorithm 4: -UCB; Algorithm 5: UCB |
| Open Source Code | No | The paper does not contain an unambiguous statement or link to open-source code for the methodology described. |
| Open Datasets | No | The paper does not mention using any publicly available dataset or provide links/citations for data access. It performs numerical experiments with a simulated 'bandit problem'. |
| Dataset Splits | No | The paper describes 4·10⁵ Monte-Carlo replications for estimating regret but does not provide specific train/validation/test dataset splits. The experiments are numerical simulations rather than evaluations on a distinct dataset with such splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We represent here the regret of the five strategies presented in this article on a bandit problem with Δ = 1/5, for different values of the horizon. The regret is estimated by 4·10⁵ Monte-Carlo replications. |
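The experiment-setup row describes estimating the regret of Explore-Then-Commit strategies by Monte-Carlo simulation on a Gaussian bandit with gap 1/5. A minimal sketch of that style of experiment for fixed-budget ETC is below; it is not the authors' code, and the function name, the exploration budget `m`, and the reduced horizon and replication count (the paper uses 4·10⁵ replications) are all assumptions made for a quick illustration.

```python
import numpy as np

def etc_regret(delta=0.2, horizon=1000, m=50, n_reps=2000, seed=0):
    """Monte-Carlo estimate of the regret of fixed-budget
    Explore-Then-Commit on a two-armed unit-variance Gaussian bandit
    with mean gap `delta`.

    Each arm is pulled m times; the empirically better arm is then
    played for the remaining horizon - 2m rounds.
    """
    rng = np.random.default_rng(seed)
    means = np.array([delta, 0.0])  # arm 0 is optimal
    regrets = np.empty(n_reps)
    for r in range(n_reps):
        # Exploration phase: m unit-variance Gaussian samples per arm.
        samples = rng.normal(means, 1.0, size=(m, 2))
        chosen = int(np.argmax(samples.mean(axis=0)))
        # Regret = gap * (total pulls of the suboptimal arm).
        n_subopt = m + (horizon - 2 * m) * (chosen == 1)
        regrets[r] = delta * n_subopt
    return regrets.mean()
```

Averaging over replications trades off the exploration cost (regret `delta * m` paid always) against the misidentification cost (regret `delta * (horizon - 2m)` paid when the wrong arm is committed to), which is the trade-off the paper's fixed-budget analysis quantifies.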