Robust Bandit Learning with Imperfect Context
Authors: Jianyi Yang, Shaolei Ren
Pages: 10594-10602
AAAI 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we apply MaxMinUCB and MinWD to online edge datacenter selection, and run synthetic simulations to validate our theoretical analysis." Also: "In Fig. 2, we compare different algorithms in terms of three cumulative regret objectives: robust regret in Eqn. (7), worst-case regret in Eqn. (8) and true regret in Eqn. (2)." |
| Researcher Affiliation | Academia | Jianyi Yang, Shaolei Ren University of California, Riverside {jyang239, shaolei}@ucr.edu |
| Pseudocode | Yes | Algorithm 1: Robust Arm Selection with Imperfect Context |
| Open Source Code | No | The paper contains no explicit statement about releasing source code for the methodology and no links to code repositories. |
| Open Datasets | No | "Finally, we apply MaxMinUCB and MinWD to online edge datacenter selection, and run synthetic simulations to validate our theoretical analysis." Also: "Given a sequence of true contexts, imperfect context sequence is generated by sampling i.i.d. uniform distribution over B(x_t) at each round." The data are therefore synthetic; the paper does not use a publicly available dataset or give access information for one. |
| Dataset Splits | No | The paper describes synthetic simulations over time slots in an online learning scenario but does not provide specific train/validation/test dataset splits. |
| Hardware Specification | No | The paper describes running synthetic simulations but does not provide any specific hardware details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions specific parameters for the Gaussian kernel and other settings but does not list any software dependencies with specific version numbers. |
| Experiment Setup | Yes | In the simulations, Gaussian kernel with parameter 0.1 is used for reward (loss) estimation. λ in Eqn. (3) is set as 0.1. The exploration rate is set as h_t = 0.04. |
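
The reported settings can be tried out in a small simulation. The sketch below is not the authors' code, which is not released. It makes several assumptions: that Eqn. (3) is a kernel ridge regression with regularizer λ, that "parameter 0.1" is the Gaussian kernel bandwidth, that the UCB is the estimated mean plus h_t times a kernel-based confidence width, and that the imperfect-context set B(x_t) is an L2 ball whose `radius` is a hypothetical value chosen by the user. All function names are illustrative.

```python
import numpy as np

# Values quoted in the Experiment Setup row; their exact roles are assumed.
KERNEL_PARAM = 0.1  # assumed to be the Gaussian kernel bandwidth
LAMBDA = 0.1        # regularizer, assumed to enter as K + lambda * I
H_T = 0.04          # exploration rate, assumed to scale the confidence width


def gaussian_kernel(a, b, bandwidth=KERNEL_PARAM):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    a = np.atleast_2d(a)
    b = np.atleast_2d(b)
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def kernel_ucb(x_hist, y_hist, x_query, lam=LAMBDA, h=H_T):
    """Return (ucb, mean, width) at each row of x_query for one arm."""
    gram = gaussian_kernel(x_hist, x_hist) + lam * np.eye(len(x_hist))
    k_query = gaussian_kernel(x_hist, x_query)  # shape (n_hist, n_query)
    mean = k_query.T @ np.linalg.solve(gram, y_hist)
    # k(x, x) = 1 for the Gaussian kernel, so the width lies in [0, 1].
    var = 1.0 - np.sum(k_query * np.linalg.solve(gram, k_query), axis=0)
    width = np.sqrt(np.maximum(var, 0.0))
    return mean + h * width, mean, width


def sample_imperfect_context(x_true, radius, rng):
    """Draw one point uniformly from the L2 ball of `radius` around x_true."""
    dim = x_true.shape[0]
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    # The 1/dim power makes the sample uniform in volume, not in radius.
    r = radius * rng.uniform() ** (1.0 / dim)
    return x_true + r * direction
```

A robust selection rule in the spirit of the paper would evaluate `kernel_ucb` for each arm over candidate contexts around the observed imperfect context and compare worst-case values. That step is omitted here because Algorithm 1 is not reproduced on this page.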