Bandits with many optimal arms
Authors: Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Numerical Experiments We compare our algorithm against other standard bandit algorithms via numerical simulations. The results reported are averaged over 100 independent trials. |
| Researcher Affiliation | Collaboration | Alon Cohen: Massachusetts Institute of Technology, Cambridge, MA, USA, alonco@mit.edu. Yishay Mansour: Google and Tel Aviv University, Tel Aviv, Israel, mansour.yishay@gmail.com. |
| Pseudocode | No | The paper describes the algorithm mathematically and textually, but does not include a dedicated 'Pseudocode' or 'Algorithm' block with structured steps. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | No | The paper conducts numerical simulations rather than using a publicly available dataset. It describes the simulation setup (e.g., '100 independent trials', 'Gaussian distribution with mean µ and variance σ² = 0.01'). There is no external dataset to provide access information for. |
| Dataset Splits | No | The paper discusses numerical simulations and hyperparameter tuning but does not explicitly provide details on train/validation/test dataset splits, as is common in supervised learning. It mentions: 'The hyperparameter C was selected to achieve the best empirical performance on the range of parameters we tested.' |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the numerical experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or solvers). |
| Experiment Setup | Yes | The paper provides details on the experimental setup, including the number of trials ('averaged over 100 independent trials'), the horizon ('T = 10000'), number of arms ('K = 100'), and the distribution parameters for rewards ('Gaussian distribution with mean µ and variance σ² = 0.01'). It also mentions that 'The hyperparameter C was selected to achieve the best empirical performance on the range of parameters we tested.' |
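
Since no code is released, the reported setup (Gaussian rewards with variance 0.01, a fixed horizon, regret averaged over independent trials) could be approximated with a small simulation harness. The following is a minimal sketch under stated assumptions, not the authors' implementation: UCB1 is used only as a stand-in baseline because the paper's algorithm and its hyperparameter C are not described here, and the function names `run_trial` and `average_regret` and any arm means passed in are hypothetical.

```python
import math
import random


def run_trial(means, horizon, sigma2=0.01, rng=None):
    """Run UCB1 (a stand-in baseline, NOT the paper's algorithm) for one trial.

    Returns the cumulative pseudo-regret after each of `horizon` rounds.
    """
    rng = rng or random.Random()
    k = len(means)
    counts = [0] * k          # number of pulls per arm
    sums = [0.0] * k          # sum of observed rewards per arm
    best = max(means)
    sigma = math.sqrt(sigma2)
    regret, total = [], 0.0
    for t in range(horizon):
        if t < k:
            arm = t           # initial pass: pull each arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t + 1) / counts[a]),
            )
        reward = rng.gauss(means[arm], sigma)  # Gaussian reward, variance sigma2
        counts[arm] += 1
        sums[arm] += reward
        total += best - means[arm]             # pseudo-regret of this pull
        regret.append(total)
    return regret


def average_regret(means, horizon=10_000, n_trials=100, sigma2=0.01, seed=0):
    """Average the cumulative regret curve over independent trials."""
    rng = random.Random(seed)
    curves = [run_trial(means, horizon, sigma2, rng) for _ in range(n_trials)]
    return [sum(c[t] for c in curves) / n_trials for t in range(horizon)]
```

To mirror the reported scale, one would pass a list of K = 100 arm means and keep the defaults for the horizon and trial count. The means themselves, including how many arms are optimal, are not given in the table above and would have to be taken from the paper. In pure Python the full-size run is slow, so a smaller horizon is a sensible first check.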