Combinatorial Multi-Armed Bandits with Concave Rewards and Fairness Constraints
Authors: Huanle Xu, Yang Liu, Wing Cheong Lau, Rui Li
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate the performance of our algorithm via extensive simulations and demonstrate that it outperforms the baselines substantially. In this section, we conduct simulation studies to evaluate the performance of CMF in terms of both the time-average regret and the violation of fairness requirements. |
| Researcher Affiliation | Academia | Huanle Xu1 , Yang Liu2 , Wing Cheong Lau2 and Rui Li1 1Dongguan University of Technology, 2The Chinese University of Hong Kong |
| Pseudocode | Yes | Algorithm 1: Combinational Multi-Arm Selection Algorithm with Fairness Guarantees |
| Open Source Code | No | The paper does not include any statement or link regarding the public release of its source code. |
| Open Datasets | No | The paper describes generating data for simulations ('The expected reward for all arms are uniformly chosen between [0,1]. For each arm, the actual rewards in all rounds are generated following the Pareto distribution with the order of two.') but does not refer to a publicly available or open dataset with access information. |
| Dataset Splits | No | The paper describes simulation parameters such as the number of rounds (T) and values for α, but does not specify dataset splits (e.g., training, validation, test percentages or counts) for model training and evaluation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU/GPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers, such as programming languages, libraries, or solvers used for implementation or experimentation. |
| Experiment Setup | Yes | We choose f to be a linear function and consider the following scenario for the simulation: N = 100 and m = 30. The values of ξ are generated uniformly at random between [0.01, 1] and PN i=1 ξi = 15. The expected reward for all arms are uniformly chosen between [0,1]. For each arm, the actual rewards in all rounds are generated following the Pareto distribution with the order of two. We first evaluate the impact of α on the regret performance as well as the overall violation of the fairness for each arm. To be specific, we simulate our proposed CMF with α = {1, 100, 10000, ∞} and illustrate the results in Fig. 1. |