Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improved Algorithms for Conservative Exploration in Bandits
Authors: Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta
Pages: 3962-3969
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we provide empirical evidence of the advantage of the Martingale lower-bound and the action selection process in synthetic and real-data problems. |
| Researcher Affiliation | Industry | Facebook AI Research, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: CLUCB2 (T = 1) and CLUCB2T |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Dataset-based Environments: Fig. 5 reports the results using the Jester Dataset (Goldberg et al. 2001). |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction. It mentions using the Jester dataset but not how it was partitioned for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as GPU/CPU models, memory, or cloud computing specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers (e.g., Python, PyTorch, or other relevant packages with their versions). |
| Experiment Setup | Yes | The conservative level α is set to 0.05, the horizon n to 10^6 (T = 1) and δ = 0.01. |