High-Dimensional Dueling Optimization with Preference Embedding

Authors: Yangwenhui Zhang, Hong Qian, Xiang Shu, Aimin Zhou

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results verify that, on molecule discovery and web-page recommendation dueling optimization tasks, the preferential intrinsic dimension exists and PE-DBO is superior in scalability to the state-of-the-art (SOTA) methods. and PE-DBO is implemented with BoTorch (Balandat et al. 2020). The code is available at https://github.com/Zhangywh/PE-DBO. To study the effectiveness of preference embedding rather than improvements to the GP model, we use the default setting of the GP model in BoTorch. CMA-ES (Hansen, Müller, and Koumoutsakos 2003) is adopted as the optimizer of the acquisition function. PE-DBO is deployed on both synthetic test functions and real-world tasks. Three preferential optimization algorithms are compared, i.e., PBO (González et al. 2017), KSS (Sui et al. 2017), and a comparison-only version of comp-GP-UCB (Xu et al. 2020) that removes the second optimization part, which uses the function value.
Researcher Affiliation | Academia | Yangwenhui Zhang, Hong Qian*, Xiang Shu, Aimin Zhou; Shanghai Institute of AI for Education and School of Computer Science and Technology, East China Normal University, Shanghai 200062, China; {51215901086, 51255901138}@stu.ecnu.edu.cn, {hqian, amzhou}@cs.ecnu.edu.cn
Pseudocode | Yes | Algorithm 1: PE-DBO
Input: initial dataset D_M = {([y_i; y'_i], p_i)}_{i=1}^M, number of available duels N, original dimension D, low dimension d, and the boundary Y of the low-dimensional subspace.
Procedure:
1: Generate a random matrix A ∈ R^{D×d} and set A_p = [[A, O], [O, A]].
2: for j = M to M + N − 1 do
3:   Fit a GP to D_j and learn π_{f_p, j}([y; y']).
4:   Sample a function π̂_{f_p} from the GP.
5:   y_next = argmax_{y ∈ Y} ∫_Y π̂_{f_p}([y; y']; D_j) dy'.
6:   y'_next = argmax_{y' ∈ Y} σ(GP | y = y_next, D_j).
7:   Compute [x_next; x'_next] = A_p [y_next; y'_next].
8:   Run the duel [x_next; x'_next] and obtain p_{j+1}.
9:   Augment D_{j+1} = D_j ∪ {([y_next; y'_next], p_{j+1})}.
10: end for
11: Fit a GP to D_{M+N} and find the solution y* with the highest soft-Copeland score.
12: return x* = A y*.
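The embedding steps of Algorithm 1 (Steps 1 and 7: lifting a low-dimensional duel back to the original space via the block matrix A_p) can be sketched in a few lines of NumPy. This is a minimal illustration of the random-projection bookkeeping only, not the authors' implementation; the dimensions and variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

D, d = 136, 12  # illustrative: original dimension and low dimension (2d = 24)

# Step 1: random embedding matrix A in R^{D x d}, and the block-diagonal
# duel version A_p = [[A, O], [O, A]], so a duel [y; y'] in R^{2d}
# maps to [x; x'] = A_p [y; y'] in R^{2D}.
A = rng.standard_normal((D, d))
O = np.zeros((D, d))
A_p = np.block([[A, O],
                [O, A]])

# Step 7: map a low-dimensional duel back to the original space.
y = rng.uniform(-1.0, 1.0, size=d)        # candidate in the low-dim box
y_prime = rng.uniform(-1.0, 1.0, size=d)  # its opponent
x_duel = A_p @ np.concatenate([y, y_prime])

# The block structure means the two arms are embedded independently:
# the first D entries equal A @ y and the last D equal A @ y'.
assert np.allclose(x_duel[:D], A @ y)
assert np.allclose(x_duel[D:], A @ y_prime)
```

Because A_p is block diagonal, the duel embedding never mixes the two arms, which is what lets Step 12 recover a single solution as x* = A y*.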
Open Source Code | Yes | PE-DBO is implemented with BoTorch (Balandat et al. 2020). The code is available at https://github.com/Zhangywh/PE-DBO.
Open Datasets | Yes | The first dataset is the Microsoft Learning to Rank (MSLR) dataset (Qin et al. 2010); we use the MSLR-WEB10K version, which has more than 10,000 queries. Each datum has 136 features describing a web page. Each web page carries a relevance judgment, an integer from 0 (irrelevant) to 4 (perfectly relevant). The second dataset is the ChEMBL dataset in Design-Bench (Trabucco et al. 2022), a benchmark for black-box model-based optimization problems.
Dataset Splits | No | In the synthetic test function experiments, I = 500 points are used to estimate the integral of the soft-Copeland score, M = 30 duels are used to initialize the GP model, and N = 50 duels are used for optimization. (This describes initial data and the optimization budget, not standard train/validation/test dataset splits.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned for running the experiments.
Software Dependencies | No | PE-DBO is implemented with BoTorch (Balandat et al. 2020). The code is available at https://github.com/Zhangywh/PE-DBO. To study the effectiveness of preference embedding rather than improvements to the GP model, we use the default setting of the GP model in BoTorch. CMA-ES (Hansen, Müller, and Koumoutsakos 2003) is adopted as the optimizer of the acquisition function. (Specific version numbers for BoTorch, CMA-ES, or other software are not provided.)
Experiment Setup | Yes | In the synthetic test function experiments, I = 500 points are used to estimate the integral of the soft-Copeland score, M = 30 duels are used to initialize the GP model, and N = 50 duels are used for optimization. The dimension of the low-dimensional subspace is 24, i.e., 2d = 24, and the boundary is [−1, 1]^24. and All experiments are repeated 20 times and the results are shown in Figure 1. and In the experiment on the MSLR dataset, we set the boundary of the low-dimensional subspace to [ d]^{2d} (other boundaries have also been tested; cf. Appendix E for more details). and All experiments are repeated 20 times with N = 80 duels for optimization; M = 25 and I = 600 for MSLR, and M = 15 and I = 300 for ChEMBL.
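For reference, the reported hyperparameters can be collected in one place. The dictionary below is a hypothetical consolidation for readability; the names and structure are ours, not from the released code.

```python
# Reported PE-DBO hyperparameters per task (illustrative grouping).
# I: points estimating the soft-Copeland integral
# M: duels initializing the GP; N: duels spent on optimization
SETTINGS = {
    "synthetic": {"I": 500, "M": 30, "N": 50, "low_dim_2d": 24},
    "MSLR":      {"I": 600, "M": 25, "N": 80},
    "ChEMBL":    {"I": 300, "M": 15, "N": 80},
}
REPEATS = 20  # every experiment is repeated 20 times

# Total duels evaluated per run = initialization duels + optimization duels.
total_duels = {task: s["M"] + s["N"] for task, s in SETTINGS.items()}
```

This makes the duel budgets easy to compare across tasks (80 on the synthetic functions, 105 on MSLR, 95 on ChEMBL).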