Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Best Model Identification: A Rested Bandit Formulation

Authors: Leonardo Cella, Massimiliano Pontil, Claudio Gentile

ICML 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In Appendix D we included simple preliminary experiments on synthetic data that help corroborate our theoretical findings. |
| Researcher Affiliation | Collaboration | ¹Italian Institute of Technology, Genoa, Italy; ²University College London, United Kingdom; ³Google Research, New York, USA. |
| Pseudocode | Yes | Algorithm 1 Explore-Then-Commit (ETC) [...] Algorithm 2 REST-SURE |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the methodology described. |
| Open Datasets | No | We use a synthetic dataset with K = 2 arms and a time horizon T = 1000... |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. It mentions using synthetic data for preliminary experiments but no specific split information. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies. |
| Experiment Setup | No | The paper mentions parameters for the synthetic data setup (K=2, T=1000) in Appendix D, but does not provide specific experimental setup details such as hyperparameters or system-level training settings for the algorithms themselves in the main text. |