Improved Algorithms for Conservative Exploration in Bandits
Authors: Evrard Garcelon, Mohammad Ghavamzadeh, Alessandro Lazaric, Matteo Pirotta (pp. 3962-3969)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we provide empirical evidence of the advantage of the Martingale lower-bound and the action selection process in synthetic and real-data problems. |
| Researcher Affiliation | Industry | Facebook AI Research, evrard.garcelon@gmail.com, {mgh, lazaric, pirotta}@fb.com |
| Pseudocode | Yes | Algorithm 1: CLUCB2 (T = 1) and CLUCB2T |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Dataset-based Environments: Fig. 5 reports the results using the Jester Dataset (Goldberg et al. 2001) |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction. It mentions using the Jester dataset but not how it was partitioned for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run its experiments, such as GPU/CPU models, memory, or cloud computing specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers (e.g., Python, PyTorch, or other relevant packages with their versions). |
| Experiment Setup | Yes | The conservative level α is set to 0.05, the horizon n to 10^6 (T = 1) and δ = 0.01. |
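
The reported settings can be collected in a small configuration object, as in the sketch below. This is an illustration only and is not the authors' code. The names `ExperimentSetup` and `is_conservative` are hypothetical. The constraint checked is the usual conservative-bandit condition (cumulative learner reward at least (1 − α) times the cumulative baseline reward), applied here to realized rewards at every T-th round. That reading of T and the use of realized rather than expected rewards are both assumptions, as is reading the horizon as 10^6.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the settings quoted in the table."""
    alpha: float = 0.05     # conservative level (as reported)
    delta: float = 0.01     # confidence parameter (as reported)
    horizon: int = 10**6    # horizon n, read as 10^6 (assumption)
    checkpoint: int = 1     # T = 1 (as reported)


def is_conservative(
    learner_rewards: Sequence[float],
    baseline_rewards: Sequence[float],
    alpha: float,
    checkpoint: int = 1,
) -> bool:
    """Check the conservative condition at every `checkpoint`-th round.

    Assumed condition: sum of learner rewards up to round t is at least
    (1 - alpha) times the sum of baseline rewards up to round t.
    """
    learner_total = 0.0
    baseline_total = 0.0
    rounds = zip(learner_rewards, baseline_rewards)
    for t, (reward, baseline) in enumerate(rounds, start=1):
        learner_total += reward
        baseline_total += baseline
        # Only rounds that are multiples of `checkpoint` are checked.
        if t % checkpoint == 0 and learner_total < (1.0 - alpha) * baseline_total:
            return False
    return True


setup = ExperimentSetup()
ok = is_conservative([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], setup.alpha, setup.checkpoint)
```

With `checkpoint=1` the condition is enforced at every round, so a single poor early reward can violate it. A larger checkpoint value only tests the running totals periodically, which leaves more room for exploration between checks.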