Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits
Authors: Ji Cheng, Bo Xue, Jiaxiang Yi, Qingfu Zhang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Theoretical analysis as well as numerical experiments demonstrate the effectiveness of our algorithms. |
| Researcher Affiliation | Academia | ¹Department of Computer Science, City University of Hong Kong; ²The City University of Hong Kong Shenzhen Research Institute; ³Department of Material Engineering, Delft University of Technology. {J.Cheng, boxue4-c}@my.cityu.edu.hk, J.Yi@tudelft.nl, qingfu.zhang@cityu.edu.hk |
| Pseudocode | Yes | Algorithm 1: MOSLB-PC, Algorithm 2: Prior Free Lexicographical filter (PFLF), Algorithm 3: MOSLB-PL |
| Open Source Code | Yes | Implementation code can be accessed via our webpage: https://github.com/jicheng9617/moslb |
| Open Datasets | No | We generated 5d arms uniformly from the centered unit ball. The paper describes synthetic data generation, but does not provide access information or citations for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) for reproducibility. It mentions a time horizon T=3000 rounds. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We first experimented in the environment with m = 5 objectives and the priority chains represented by {(1, 2), (3, 4, 5)}, which means that the first chain contains two objectives in lexicographic order while the second chain has three objectives. Three settings, in which the context dimension d is picked from {5, 10, 15}, were investigated, and the unknown coefficients θ_i are sampled uniformly from the unit ball. We generated 5d arms uniformly from the centered unit ball. Since the algorithms involve randomness, we carried out 10 trials with T = 3000 rounds and reported the outcomes in Fig. 3, where the lines represent average performance among ten trials and the shaded area shows the variance. |
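
The synthetic environment described in the Experiment Setup row can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: "5d arms" is read here as 5 × d arms, and the helper names `sample_unit_ball` and `make_environment`, the `arms_per_dim` parameter, and the seed are hypothetical. The reward noise model is not given in the table, so only expected rewards are computed.

```python
import numpy as np


def sample_unit_ball(n, d, rng):
    """Draw n points uniformly from the centered unit ball in R^d."""
    # A normalized Gaussian vector gives a uniform direction on the sphere.
    directions = rng.standard_normal((n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling by U^(1/d) makes the points uniform in volume, not in radius.
    radii = rng.random(n) ** (1.0 / d)
    return directions * radii[:, None]


def make_environment(d, m=5, arms_per_dim=5, seed=0):
    """Build one synthetic instance (hypothetical helper).

    Assumption: "5d arms" means arms_per_dim * d arms in total.
    """
    rng = np.random.default_rng(seed)
    theta = sample_unit_ball(m, d, rng)                # one coefficient vector per objective
    arms = sample_unit_ball(arms_per_dim * d, d, rng)  # arm feature vectors
    expected_rewards = arms @ theta.T                  # shape (num_arms, m)
    return theta, arms, expected_rewards


# Zero-based form of the priority chains {(1, 2), (3, 4, 5)} from the table.
PRIORITY_CHAINS = [(0, 1), (2, 3, 4)]

if __name__ == "__main__":
    for d in (5, 10, 15):
        theta, arms, rewards = make_environment(d)
        print(d, theta.shape, arms.shape, rewards.shape)
```

Because every coefficient vector and every arm lies in the unit ball, each expected reward is bounded in absolute value by 1 (Cauchy-Schwarz), which is the usual boundedness condition in linear bandit analyses.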