Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits

Authors: Ji Cheng, Bo Xue, Jiaxiang Yi, Qingfu Zhang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Theoretical analysis as well as numerical experiments demonstrate the effectiveness of our algorithms.
Researcher Affiliation | Academia | (1) Department of Computer Science, City University of Hong Kong; (2) The City University of Hong Kong Shenzhen Research Institute; (3) Department of Material Engineering, Delft University of Technology. {J.Cheng, boxue4-c}@my.cityu.edu.hk, J.Yi@tudelft.nl, qingfu.zhang@cityu.edu.hk
Pseudocode | Yes | Algorithm 1: MOSLB-PC, Algorithm 2: Prior Free Lexicographical filter (PFLF), Algorithm 3: MOSLB-PL
Open Source Code | Yes | Implementation code can be accessed via our webpage (https://github.com/jicheng9617/moslb).
Open Datasets | No | We generated 5d arms uniformly from the centered unit ball. The paper describes synthetic data generation, but does not provide access information or citations for a publicly available or open dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or citations to predefined splits) for reproducibility. It mentions a time horizon of T = 3000 rounds.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | We first experimented in the environment with m = 5 objectives and the priority chains represented by {(1, 2), (3, 4, 5)}, which means that the first chain contains two objectives in lexicographic order while the second chain has three objectives. Three settings, the context's dimension d are picked from {5, 10, 15}, were investigated, and the unknown coefficients θ_i are sampled uniformly from the unit ball. We generated 5d arms uniformly from the centered unit ball. Since the algorithms involve randomness, we carried out 10 trials with round T = 3000 and reported the outcomes in Fig. 3, where the lines represent average performance among ten trials and the shadow area shows the variance.
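
The synthetic environment described in the Experiment Setup row can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code (see their repository for that). It assumes that "5d arms" means 5 × d arms, that uniform sampling from the ball is done by scaling normalized Gaussian directions, and that priority chains are stored as 0-based index tuples. The names `sample_unit_ball`, `make_environment`, `PRIORITY_CHAINS`, and `summarize_trials` are hypothetical, and the reward noise model is left out because the summary does not state it.

```python
import numpy as np

def sample_unit_ball(n, d, rng):
    """Draw n points uniformly from the centered unit ball in R^d."""
    g = rng.standard_normal((n, d))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    # Radius U^(1/d) makes the points uniform in volume, not just in direction.
    radii = rng.random(n) ** (1.0 / d)
    return directions * radii[:, None]

def make_environment(d, m=5, seed=0):
    """Return (theta, arms): m coefficient vectors and 5*d arms (assumed count)."""
    rng = np.random.default_rng(seed)
    theta = sample_unit_ball(m, d, rng)     # one unknown vector per objective
    arms = sample_unit_ball(5 * d, d, rng)  # assumption: "5d" means 5 times d
    return theta, arms

# Priority chains {(1, 2), (3, 4, 5)} from the setup, as 0-based indices.
PRIORITY_CHAINS = [(0, 1), (2, 3, 4)]

def summarize_trials(curves):
    """Per-round mean and variance across trials (rows are trials)."""
    curves = np.asarray(curves, dtype=float)
    return curves.mean(axis=0), curves.var(axis=0)

if __name__ == "__main__":
    for d in (5, 10, 15):
        theta, arms = make_environment(d, seed=d)
        expected_rewards = arms @ theta.T  # shape (5*d, m), noise-free
        print(d, arms.shape, theta.shape, expected_rewards.shape)
```

With 10 trials of T = 3000 rounds, `summarize_trials` would take a 10 × 3000 array of per-round performance values and return the mean line and variance band of the kind the row attributes to Fig. 3.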