High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling

Authors: Yuxuan Yin, Yu Wang, Peng Li

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate the proposed TSBO on various challenging high-dimensional datasets and show superior data efficiency improvement. In a chemical design task (Sterling & Irwin, 2015) and an expression reconstruction task (Kusner et al., 2017), we achieve SOTA results compared to recent BO approaches."
Researcher Affiliation | Academia | Yuxuan Yin, Yu Wang, Peng Li: Department of Electrical and Computer Engineering, University of California, Santa Barbara, USA.
Pseudocode | Yes | "Algorithm 1: Bi-Level Optimization of the Teacher-Student Model" (an illustrative sketch of this bi-level structure follows the table).
Open Source Code | Yes | "The implementation is available at https://github.com/reminiscenty/TSBO-Official."
Open Datasets | Yes | "The first dataset comprises 40,000 single-variable arithmetic expressions, and is employed for an arithmetic expression reconstruction task (Kusner et al., 2017). The second ZINC250K dataset (Sterling & Irwin, 2015), consisting of 250,000 molecules, is used for two chemical design tasks with two objective molecule profiles: the penalized water-octanol partition coefficient (Penalized Log P) (Gómez-Bombarelli et al., 2018) and the Ranolazine Multi Property Objective (Ranolazine MPO) (Brown et al., 2019)."
Dataset Splits | Yes | "Number of validation data: 10 / 30"
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software components such as an MLP, a GP with an RBF kernel, and the Adam optimizer, but does not provide specific version numbers for the programming languages or libraries used in its implementation (a sketch of these components follows the table).
Experiment Setup | Yes | "Table 6: Hyperparameters of TSBO"
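
For readers who want a feel for the bi-level structure named in Algorithm 1, the sketch below shows a generic one-step-unrolled teacher-student bi-level update in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the network shapes, learning rates, single-gradient-step inner problem, and toy data are all assumptions here; TSBO's actual teacher, student, and unlabeled-data sampling are specified in the paper and the released code.

```python
# Illustrative sketch of one-step-unrolled bi-level teacher-student training.
# All names, shapes, and hyperparameters are assumptions, not the authors' code.
# Requires PyTorch >= 2.0 for torch.func.functional_call.
import torch
import torch.nn as nn
from torch.func import functional_call

teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
student = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt_teacher = torch.optim.Adam(teacher.parameters(), lr=1e-3)
opt_student = torch.optim.Adam(student.parameters(), lr=1e-3)
mse = nn.MSELoss()
inner_lr = 1e-2

# Toy stand-ins: labeled pairs, unlabeled inputs, and a small validation set.
x_l, y_l = torch.randn(32, 16), torch.randn(32, 1)
x_u = torch.randn(128, 16)
x_v, y_v = torch.randn(10, 16), torch.randn(10, 1)

for step in range(200):
    # Outer (teacher) update: differentiate the student's validation loss
    # through one simulated SGD step taken on teacher-generated pseudo-labels.
    pseudo_y = teacher(x_u)                       # keep the graph to the teacher
    params = dict(student.named_parameters())
    inner_loss = mse(functional_call(student, params, (x_u,)), pseudo_y)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    updated = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}
    val_loss = mse(functional_call(student, updated, (x_v,)), y_v)
    opt_teacher.zero_grad()
    val_loss.backward()
    opt_teacher.step()

    # Inner (student) update: fit the (now fixed) teacher's pseudo-labels on the
    # unlabeled inputs together with the labeled data.
    opt_student.zero_grad()
    student_loss = mse(student(x_u), teacher(x_u).detach()) + mse(student(x_l), y_l)
    student_loss.backward()
    opt_student.step()
```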
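
The software-dependency row above names an MLP, a GP with an RBF kernel, and the Adam optimizer. The following sketch shows how such a surrogate is commonly set up and fit with Adam; the library choice (GPyTorch), the data shapes, and the hyperparameters are assumptions for illustration and are not taken from the paper.

```python
# Minimal GPyTorch sketch of a GP surrogate with an RBF kernel, fit with Adam.
# Library choice, shapes, and hyperparameters are illustrative assumptions.
import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

train_x = torch.rand(50, 20)   # toy stand-in for (latent-space) inputs
train_y = torch.randn(50)      # toy stand-in for objective values

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
model.train()
likelihood.train()

# model.parameters() includes the Gaussian likelihood's noise parameter.
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for _ in range(100):           # maximize the exact marginal log-likelihood
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()
```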