Scalable Bayesian Optimization via Focalized Sparse Gaussian Processes

Authors: Yunyue Wei, Vincent Zhuang, Saraswati Soedarmadji, Yanan Sui

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that FocalBO can efficiently leverage large amounts of offline and online data to achieve state-of-the-art performance on robot morphology design and to control a 585-dimensional musculoskeletal system." and "In this section, we extensively evaluate FocalBO over a variety of tasks."
Researcher Affiliation | Collaboration | Yunyue Wei (1), Vincent Zhuang (2), Saraswati Soedarmadji (1), Yanan Sui (1); (1) Tsinghua University, (2) Google DeepMind
Pseudocode | Yes | Algorithm 1 (FocalBO) and Algorithm 2 (FocalAcq); a runnable orientation sketch of the hierarchical search idea follows the table.
Open Source Code | Yes | "Our code for fully reproducing all experimental results is available at: https://github.com/yunyuewei/FocalBO."
Open Datasets | Yes | "We compare FocalBO to several baselines over the robot morphology design task from Design-Bench, which provides a large offline dataset with an exact function oracle [6]." and "We use the musculoskeletal system from [62], which enables forward simulation with MuJoCo [65] and environment customization." A hedged data-loading sketch follows the table.
Dataset Splits | Yes | "The offline dataset contains 2000 random data points and the online budget is 500 with a batch size of 10." and "In this task, we use the training dataset with 10,000 points and additionally evaluate 128 points on-the-fly with a batch size of 4."
Hardware Specification | Yes | "All experiments are conducted on a server with an Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz, an NVIDIA A100 GPU, and 512 GB of memory."
Software Dependencies | Yes | "We implement FocalBO with BoTorch, which is a popular library for BO implementation with GPU acceleration." and "We choose DKitty morphology design for its consistency in function values between offline dataset and online function oracle, and its compatibility with Python 3.8+." and "For each round of GP training, we fit GP hyperparameters (and variational parameters for focalized GP and SVGP) for 1000 epochs via the Adam optimizer [63] with a learning rate of 0.01."
Experiment Setup | Yes | "For all GPs, we use a Matérn 5/2 kernel with automatic relevance determination, and do not restrict the lengthscale or noise range. For each round of GP training, we fit GP hyperparameters (and variational parameters for focalized GP and SVGP) for 1000 epochs via the Adam optimizer [63] with a learning rate of 0.01. For focalized GP and SVGP, we initialize the inducing points using a Sobol sampler [64] over the input space." A GPyTorch sketch of this setup follows the table.
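
Since the paper's pseudocode is not reproduced on this page, the following is a minimal, runnable orientation sketch of the hierarchical acquisition optimization idea named in Algorithm 2: the acquisition is optimized over a sequence of boxes that shrink around the incumbent, and the best candidate across levels is returned. The halving schedule, the random-search inner loop, and the toy acquisition are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative sketch only (assumptions, not the paper's Algorithm 2):
# optimize an acquisition over progressively smaller boxes around the
# incumbent and keep the best candidate found at any level.
import numpy as np

rng = np.random.default_rng(0)

def toy_acquisition(x):
    # Stand-in for a GP-based acquisition (e.g. UCB); peaked at the origin.
    return -np.sum(x ** 2, axis=-1)

def hierarchical_acq_opt(acq, lo, hi, incumbent, n_levels=3, n_samples=256):
    best_x, best_val = None, -np.inf
    for _ in range(n_levels):
        # Random-search the acquisition inside the current box.
        cand = rng.uniform(lo, hi, size=(n_samples, lo.shape[0]))
        vals = acq(cand)
        i = int(np.argmax(vals))
        if vals[i] > best_val:
            best_x, best_val = cand[i], vals[i]
        # Halve the box around the incumbent (assumed shrinking schedule).
        half = (hi - lo) / 4.0
        lo = np.maximum(lo, incumbent - half)
        hi = np.minimum(hi, incumbent + half)
    return best_x, best_val

x_next, acq_val = hierarchical_acq_opt(
    toy_acquisition, lo=-np.ones(5), hi=np.ones(5), incumbent=np.zeros(5))
```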
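
The Design-Bench rows map onto that library's standard task interface. Assuming the D'Kitty task is the registered 'DKittyMorphology-Exact-v0' and exposes the usual task.x / task.y arrays and task.predict oracle, the paper's 2000-point random offline subset could be drawn roughly as follows.

```python
# Hedged sketch: load the DKitty morphology task from Design-Bench and
# draw a 2000-point random offline subset (the size stated in the paper).
import numpy as np
import design_bench

task = design_bench.make('DKittyMorphology-Exact-v0')  # assumed task name
X, y = task.x, task.y                                  # full offline dataset

idx = np.random.default_rng(0).choice(len(X), size=2000, replace=False)
X_off, y_off = X[idx], y[idx]

# The exact function oracle can then score designs proposed online.
y_new = task.predict(X_off[:10])
```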
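
The experiment-setup row translates fairly directly into GPyTorch. The sketch below covers only the stated ingredients: a Matérn 5/2 ARD kernel with unconstrained hyperparameters, Sobol-initialized inducing points over the input space, and 1000 Adam epochs at a learning rate of 0.01, trained with a standard SVGP ELBO. The paper's focalized variational loss is not reproduced here; the inducing-point count and unit-cube inputs are assumptions.

```python
# Minimal SVGP setup matching the stated configuration (standard ELBO;
# the focalized loss from the paper is NOT implemented here).
import torch
import gpytorch
from torch.quasirandom import SobolEngine

class SVGPModel(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        dist = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0))
        strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, dist, learn_inducing_locations=True)
        super().__init__(strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        # Matern 5/2 with automatic relevance determination; lengthscale
        # and noise are left unconstrained, as stated in the paper.
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(
                nu=2.5, ard_num_dims=inducing_points.size(1)))

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

def fit_svgp(X, y, n_inducing=128, epochs=1000, lr=0.01):
    # Sobol-initialized inducing points over the (assumed unit-cube)
    # input space; n_inducing=128 is an assumed count.
    Z = SobolEngine(dimension=X.size(1), scramble=True).draw(n_inducing)
    model = SVGPModel(Z)
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=y.numel())
    opt = torch.optim.Adam(
        list(model.parameters()) + list(likelihood.parameters()), lr=lr)
    model.train(); likelihood.train()
    for _ in range(epochs):  # 1000 epochs, Adam, lr 0.01 (paper setting)
        opt.zero_grad()
        loss = -mll(model(X), y)
        loss.backward()
        opt.step()
    return model, likelihood

# Example usage on toy data in the unit cube.
X = torch.rand(500, 10)
y = torch.sin(X.sum(-1))
model, likelihood = fit_svgp(X, y)
```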