Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
On Explore-Then-Commit strategies
Authors: Aurélien Garivier, Tor Lattimore, Emilie Kaufmann
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore we provide empirical evidence that the theory also holds in practice and discuss extensions to non-gaussian and multiple-armed case. Numerical experiments illustrate and empirically support our results in Section 5. |
| Researcher Affiliation | Academia | Aurélien Garivier Institut de Mathématiques de Toulouse; UMR5219 Université de Toulouse; CNRS UPS IMT, F-31062 Toulouse Cedex 9, France EMAIL Emilie Kaufmann Univ. Lille, CNRS, Centrale Lille, Inria SequeL UMR 9189, CRIStAL Centre de Recherche en Informatique Signal et Automatique de Lille F-59000 Lille, France EMAIL Tor Lattimore University of Alberta 116 St & 85 Ave, Edmonton, AB T6G 2R3, Canada EMAIL |
| Pseudocode | Yes | Algorithm 1: FB-ETC algorithm; Algorithm 2: SPRT ETC algorithm; Algorithm 3: BAI-ETC algorithm; Algorithm 4: Δ-UCB; Algorithm 5: UCB |
| Open Source Code | No | The paper does not contain an unambiguous statement or link to open-source code for the methodology described. |
| Open Datasets | No | The paper does not mention using any publicly available dataset or provide links/citations for data access. It performs numerical experiments with a simulated 'bandit problem'. |
| Dataset Splits | No | The paper describes '4·10⁵ Monte-Carlo replications' for estimating regret but does not provide specific train/validation/test dataset splits. The experiments appear to be numerical simulations rather than based on a distinct dataset with such splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We represent here the regret of the five strategies presented in this article on a bandit problem with Δ = 1/5, for different values of the horizon. The regret is estimated by 4·10⁵ Monte-Carlo replications. |
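
The experiment setup quoted above (a bandit problem with gap 1/5, regret estimated by Monte-Carlo replications) can be illustrated with a minimal sketch. This is not the authors' code: it assumes a two-armed Gaussian bandit with unit variance and a generic fixed-budget explore-then-commit rule. The function names `etc_regret` and `estimate_regret`, the exploration length `explore_per_arm`, and the default replication count are hypothetical choices made for illustration only.

```python
import random


def etc_regret(horizon, explore_per_arm, gap=0.2, rng=None):
    """Pseudo-regret of one explore-then-commit run on a two-armed bandit.

    Assumptions (not taken from the paper): Gaussian rewards with unit
    variance, arm 0 better than arm 1 by `gap`.
    """
    rng = rng or random.Random()
    means = (gap, 0.0)
    # Never explore for longer than the horizon allows.
    n = min(explore_per_arm, horizon // 2)

    # Exploration phase: pull each arm n times and accumulate rewards.
    sums = [0.0, 0.0]
    for arm in (0, 1):
        for _ in range(n):
            sums[arm] += rng.gauss(means[arm], 1.0)

    # Commit phase: play the empirically better arm for the remaining rounds.
    committed = 0 if sums[0] >= sums[1] else 1
    remaining = horizon - 2 * n

    # Regret counts every pull of the suboptimal arm, weighted by the gap.
    suboptimal_pulls = n + (remaining if committed == 1 else 0)
    return gap * suboptimal_pulls


def estimate_regret(horizon, explore_per_arm, replications=1000, gap=0.2, seed=0):
    """Average the pseudo-regret over independent Monte-Carlo replications."""
    rng = random.Random(seed)
    total = sum(
        etc_regret(horizon, explore_per_arm, gap, rng) for _ in range(replications)
    )
    return total / replications


if __name__ == "__main__":
    for horizon in (100, 1000, 10000):
        print(horizon, estimate_regret(horizon, explore_per_arm=50))
```

The replication count here is kept far below the 4·10⁵ reported in the table so the sketch runs quickly; a larger count reduces the variance of the estimate at proportional cost.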