Simultaneously Learning Stochastic and Adversarial Bandits under the Position-Based Model
Authors: Cheng Chen, Canzhe Zhao, Shuai Li
Pages: 6202-6210
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments show that our algorithm could simultaneously learn in both stochastic and adversarial environments and is competitive compared to existing methods that are designed for a single environment. |
| Researcher Affiliation | Academia | (1) Nanyang Technological University; (2) Shanghai Jiao Tong University |
| Pseudocode | Yes | Algorithm 1: FTRL-PBM |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. |
| Open Datasets | No | The paper describes synthetic data generation parameters and mentions real-world data deferred to Appendix, but does not provide concrete access information (link, DOI, repository, or formal citation) for any publicly available or open dataset used for training. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For all experiments, we use n = 10 items and m = 5 positions. For the synthetic data, we set the position examination probabilities to β = (1, 1/5) which are commonly adopted in previous works (Wang et al. 2018; Li, Lattimore, and Szepesvári 2019). The attractiveness of items are set as α = (0.95, 0.95^2, ..., 0.95^9). We consider two cases of Δ = 0.03 and Δ = 0.01. |
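
The synthetic setup in the last row can be illustrated with a minimal sketch of click generation under the standard position-based model, where an item with attractiveness α_i shown at a position with examination probability β_k is clicked with probability α_i · β_k. This is not the authors' code, and it does not implement FTRL-PBM. The function name `simulate_pbm_clicks` is hypothetical, and the concrete `alpha` and `beta` vectors are illustrative assumptions: the quoted setup lists only the endpoints of β and exponents up to 9 for α, so the harmonic β and the ten geometric α values used here are placeholders chosen to match n = 10 and m = 5.

```python
import random


def simulate_pbm_clicks(alpha, beta, ranking, rng):
    """Draw one 0/1 click per position under the position-based model.

    alpha[i]   : attractiveness of item i
    beta[k]    : examination probability of position k
    ranking[k] : index of the item displayed at position k
    """
    if len(ranking) != len(beta):
        raise ValueError("ranking must place exactly one item in each position")
    if len(set(ranking)) != len(ranking):
        raise ValueError("an item cannot appear in two positions")

    clicks = []
    for position, item in enumerate(ranking):
        # Click probability factorizes into attractiveness times examination.
        click_prob = alpha[item] * beta[position]
        clicks.append(1 if rng.random() < click_prob else 0)
    return clicks


n_items, n_positions = 10, 5
# Assumption: ten geometric attractiveness values (placeholder, not the paper's exact vector).
alpha = [0.95 ** (i + 1) for i in range(n_items)]
# Assumption: harmonic examination probabilities 1, 1/2, ..., 1/5 (placeholder).
beta = [1.0 / (k + 1) for k in range(n_positions)]

rng = random.Random(0)  # fixed seed so a run is repeatable
clicks = simulate_pbm_clicks(alpha, beta, ranking=[0, 1, 2, 3, 4], rng=rng)
```

A learner would call such a simulator once per round with its chosen ranking and observe only the returned click vector, which is the feedback a bandit algorithm for this model has to work with.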