Bandits with Concave Aggregated Reward
Authors: Yingqi Yu, Sijia Zhang, Shaoang Li, Lan Zhang, Wei Xie, Xiang-Yang Li
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations demonstrate that the proposed algorithms outperform state-of-the-art bandit algorithms. |
| Researcher Affiliation | Academia | University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center |
| Pseudocode | Yes | Algorithm 1: SW-BCAR |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes generating data from 'truncated normal distributions' for simulations but does not refer to a publicly available dataset with a specific name, link, or formal citation. |
| Dataset Splits | No | The paper describes simulation parameters and performance evaluation across different settings, but it does not specify training/validation/test dataset splits needed for reproduction. |
| Hardware Specification | No | The paper discusses simulations and evaluations but does not specify any hardware details like GPU/CPU models or memory used for running the experiments. |
| Software Dependencies | No | The paper refers to benchmark algorithms but does not provide specific software names with version numbers for implementation dependencies (e.g., Python, PyTorch). |
| Experiment Setup | Yes | In the experiments, unless otherwise specified, the variables were fixed as follows: (1) the number of rounds T = 20000 and the number of arms K = 2; (2) the optimal arm's mean value µ = 0.8 and the suboptimal arms' mean values µ(a) = 0.4; (3) the aggregated reward function f(x) = √(1 + x) − 1; (4) the value-range parameter σ = 2. A hedged simulation sketch follows this table. |
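
The setup row above gives enough concrete parameters for a minimal reproduction sketch. The Python snippet below is a hedged illustration, not the paper's SW-BCAR algorithm: the truncated-normal standard deviation and truncation range, the exact form of f (reconstructed above as √(1 + x) − 1), and the way the aggregated reward is scored are all assumptions, and the uniform arm-selection policy is only a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 20_000                  # number of rounds (from the setup row)
K = 2                       # number of arms
mu = np.array([0.8, 0.4])   # optimal arm mean, suboptimal arm mean
sigma = 2.0                 # value-range parameter; interpreted here as the
                            # truncation upper bound (an assumption)

def f(x):
    """Concave aggregated-reward function, as reconstructed above (assumed form)."""
    return np.sqrt(1.0 + x) - 1.0

def draw_reward(arm):
    """Truncated-normal per-round reward; std = 1 and range [0, sigma] are assumptions."""
    return float(np.clip(rng.normal(mu[arm], 1.0), 0.0, sigma))

# Placeholder policy: uniform random arm pulls. The paper's SW-BCAR
# (Algorithm 1) would replace this selection rule.
total_raw = 0.0
for _ in range(T):
    arm = int(rng.integers(K))
    total_raw += draw_reward(arm)

# Score the run through the concave aggregation; applying f to the summed
# raw reward is one plausible reading of "aggregated reward".
print(f"aggregated reward after T rounds: {f(total_raw):.3f}")
```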