Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Robust Bandit Learning with Imperfect Context
Authors: Jianyi Yang, Shaolei Ren
AAAI 2021, pp. 10594-10602 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we apply Max Min UCB and Min WD to online edge datacenter selection, and run synthetic simulations to validate our theoretical analysis." and "In Fig. 2, we compare different algorithms in terms of three cumulative regret objectives: robust regret in Eqn. (7), worst-case regret in Eqn. (8) and true regret in Eqn. (2)." |
| Researcher Affiliation | Academia | Jianyi Yang, Shaolei Ren University of California, Riverside EMAIL |
| Pseudocode | Yes | Algorithm 1 Robust Arm Selection with Imperfect Context |
| Open Source Code | No | No explicit statement about providing open-source code for the methodology or links to code repositories are found. |
| Open Datasets | No | "Finally, we apply Max Min UCB and Min WD to online edge datacenter selection, and run synthetic simulations to validate our theoretical analysis." and "Given a sequence of true contexts, imperfect context sequence is generated by sampling i.i.d. uniform distribution over B (xt) at each round." This indicates synthetic data, not a publicly available dataset with access information. |
| Dataset Splits | No | The paper describes synthetic simulations over Time Slots and an online learning scenario but does not provide specific train/validation/test dataset splits. |
| Hardware Specification | No | The paper describes running synthetic simulations but does not provide any specific hardware details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions specific parameters for the Gaussian kernel and other settings but does not list any software dependencies with specific version numbers. |
| Experiment Setup | Yes | In the simulations, Gaussian kernel with parameter 0.1 is used for reward (loss) estimation. λ in Eqn. (3) is set as 0.1. The exploration rate is set as ht = 0.04. |
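For context on the experiment-setup row above: a Gaussian kernel with parameter 0.1 and regularization λ = 0.1 is a standard configuration for kernel-based reward estimation in contextual bandits. The sketch below shows a generic kernel ridge estimator with those reported values; the estimator form is an assumption for illustration, and the paper's exact Eqn. (3) may differ.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=0.1):
    # Gaussian (RBF) kernel; sigma = 0.1 is the parameter reported in the setup.
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2 * sigma ** 2))

def kernel_reward_estimate(contexts, rewards, x_new, sigma=0.1, lam=0.1):
    """Kernel ridge regression estimate of the reward at context x_new.

    contexts: list of past context vectors, rewards: observed rewards.
    lam = 0.1 matches the lambda value reported for Eqn. (3).
    """
    n = len(contexts)
    # Gram matrix over past contexts.
    K = np.array([[gaussian_kernel(contexts[i], contexts[j], sigma)
                   for j in range(n)] for i in range(n)])
    # Kernel evaluations between the new context and past contexts.
    k = np.array([gaussian_kernel(x_new, contexts[i], sigma) for i in range(n)])
    # Regularized least-squares weights.
    alpha = np.linalg.solve(K + lam * np.eye(n), rewards)
    return float(k @ alpha)
```

A UCB-style algorithm such as Max Min UCB would add an exploration bonus (the reported exploration rate is ht = 0.04) to estimates like this before selecting an arm; that selection logic is not reproduced here.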