Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
Authors: Julian Zimmert, Haipeng Luo, Chen-Yu Wei
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on synthetic data show that our algorithm indeed performs well over different environments. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, University of Copenhagen, Copenhagen, Denmark; ²Department of Computer Science, University of Southern California, United States. |
| Pseudocode | Yes | Algorithm 1 FTRL with hybrid regularizer for semi-bandits |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper uses 'synthetic data' for experiments, describing its generation parameters (e.g., d=10, m=5, T=10^7, Δ=1/8) but does not provide a link or citation to a publicly available dataset. |
| Dataset Splits | No | The paper does not provide specific training, validation, or test dataset splits for its synthetic data. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions algorithms like EXP2, LOGBARRIER, COMBUCB, and THOMPSON SAMPLING but does not specify software versions for these or any other programming languages or libraries used. |
| Experiment Setup | Yes | We test the algorithms on concrete instances of the m-set problem with parameters: d = 10, m = 5, T = 10^7. Specifically the final learning rates η_t for our algorithm, EXP2 and LOGBARRIER are respectively 1/√t, 1/√t and 1/√t. We measure the performance of the algorithms by the average pseudo-regret over at least 20 runs. For COMBUCB and THOMPSON SAMPLING in the adversarial environment, we increase the number of runs to 500 and 1000 respectively due to the high variance of the pseudo-regret. |
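
The evaluation protocol quoted in the Experiment Setup row (an m-set instance with d = 10 and m = 5, scored by pseudo-regret averaged over repeated runs) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the Bernoulli loss model, the way the gap Δ = 1/8 enters the mean losses, the greedy placeholder learner, and all function names are assumptions made for the example, and the horizon is kept far below T = 10^7 so that it runs quickly.

```python
import random


def mean_losses(d=10, m=5, gap=1 / 8):
    # Assumed instance: the m best arms have mean loss 1/2 - gap, the rest 1/2.
    return [0.5 - gap] * m + [0.5] * (d - m)


def pseudo_regret(means, m, choices):
    # Expected loss of the played m-sets minus that of the best fixed m-set.
    best = sum(sorted(means)[:m])
    return sum(sum(means[i] for i in chosen) - best for chosen in choices)


def run_greedy(means, m, horizon, rng):
    # Placeholder learner (NOT the paper's FTRL algorithm): play the m arms
    # with the lowest empirical mean loss. Unplayed arms are scored 0.0,
    # so every arm is tried before the estimates take over.
    d = len(means)
    counts = [0] * d
    totals = [0.0] * d
    choices = []
    for _ in range(horizon):
        est = [totals[i] / counts[i] if counts[i] else 0.0 for i in range(d)]
        chosen = sorted(range(d), key=lambda i: est[i])[:m]
        for i in chosen:
            # Semi-bandit feedback: the loss of every chosen arm is observed.
            loss = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            totals[i] += loss
        choices.append(chosen)
    return pseudo_regret(means, m, choices)


def average_pseudo_regret(runs=20, horizon=1000, m=5, seed=0):
    # Average the per-run regret over independent runs, one seed per run.
    means = mean_losses(m=m)
    results = [run_greedy(means, m, horizon, random.Random(seed + r))
               for r in range(runs)]
    return sum(results) / len(results)
```

Calling `average_pseudo_regret(runs=20, horizon=1000)` returns a single averaged value. Swapping `run_greedy` for another learner with the same signature would allow a like-for-like comparison under these assumed conditions.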