Rotting Infinitely Many-Armed Bandits
Authors: Jung-Hun Kim, Milan Vojnovic, Se-Young Yun
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present results of numerical experiments for randomly generated problem instances of rotting infinitely many-armed bandits. These results validate the insights derived from our theoretical results. |
| Researcher Affiliation | Academia | 1 Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea; 2 London School of Economics, London, UK. Correspondence to: Se-Young Yun <yunseyoung@kaist.ac.kr>, Milan Vojnović <m.vojnovic@lse.ac.uk>. |
| Pseudocode | Yes | Algorithm 1 UCB-Threshold Policy (UCB-TP); Algorithm 2 Adaptive UCB-Threshold Policy (AUCB-TP) |
| Open Source Code | Yes | Our code is available at https://github.com/junghunkim7786/rotting_infinite_armed_bandits |
| Open Datasets | No | The paper states 'We generate initial mean rewards of arms by sampling from uniform distribution on [0, 1].' This indicates the use of synthetic data generation rather than a specific, publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper uses synthetic data and does not mention explicit training, validation, or test dataset splits. The experiments are conducted over a 'time horizon T' rather than on pre-partitioned static datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed for replication. |
| Experiment Setup | Yes | We generate initial mean rewards of arms by sampling from uniform distribution on [0, 1]. In each time step, stochastic reward from pulling an arm has a Gaussian noise with mean zero and variance 1. We repeat each experiment 10 times and compute confidence intervals for confidence probability 0.95. |
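
The experiment setup row above fully specifies the synthetic data generation: initial mean rewards drawn from Uniform[0, 1], Gaussian reward noise with variance 1, 10 repetitions per experiment, and 95% confidence intervals. The sketch below reproduces that evaluation harness. Only those four ingredients come from the paper; the rotting rate `rho`, the threshold 0.5, the horizon `T`, and the greedy resample-when-rotted policy are hypothetical placeholders standing in for the paper's UCB-TP/AUCB-TP algorithms.

```python
import numpy as np


def run_trial(T, rho=0.001, rng=None):
    """One simulated trial of a rotting infinitely many-armed bandit.

    Initial mean rewards are i.i.d. Uniform[0, 1] and observed rewards carry
    N(0, 1) noise, as quoted in the table. The decay rate `rho` and the
    greedy resample-when-rotted policy are illustrative placeholders, not the
    paper's UCB-TP / AUCB-TP algorithms.
    """
    rng = rng or np.random.default_rng()
    current_mean = rng.uniform(0.0, 1.0)   # sample a fresh arm from the infinite reservoir
    rewards = np.empty(T)
    for t in range(T):
        rewards[t] = current_mean + rng.normal(0.0, 1.0)  # noisy reward, variance 1
        current_mean -= rho                                # the pulled arm "rots"
        if current_mean < 0.5:                             # crude threshold: discard and resample
            current_mean = rng.uniform(0.0, 1.0)
    return rewards.sum()


def run_experiment(T=10_000, n_runs=10, seed=0):
    """Repeat the trial 10 times and report a 95% confidence interval."""
    rng = np.random.default_rng(seed)
    totals = np.array([run_trial(T, rng=rng) for _ in range(n_runs)])
    mean = totals.mean()
    half_width = 1.96 * totals.std(ddof=1) / np.sqrt(n_runs)  # normal-approximation CI
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    mean, ci = run_experiment()
    print(f"mean cumulative reward: {mean:.1f}, 95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```

Swapping `run_trial` for the released implementation at the repository linked above would turn this harness into an actual replication; as written it only illustrates the instance generation and confidence-interval protocol described in the paper.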