Scalable Bayesian Optimization via Focalized Sparse Gaussian Processes
Authors: Yunyue Wei, Vincent Zhuang, Saraswati Soedarmadji, Yanan Sui
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that FocalBO can efficiently leverage large amounts of offline and online data to achieve state-of-the-art performance on robot morphology design and to control a 585-dimensional musculoskeletal system. and In this section, we extensively evaluate FocalBO over a variety of tasks. |
| Researcher Affiliation | Collaboration | Yunyue Wei¹, Vincent Zhuang², Saraswati Soedarmadji¹, Yanan Sui¹ (¹Tsinghua University, ²Google DeepMind) |
| Pseudocode | Yes | Algorithm 1 (FocalBO) and Algorithm 2 (FocalAcq) |
| Open Source Code | Yes | Our code for fully reproducing all experimental results is available at https://github.com/yunyuewei/FocalBO. |
| Open Datasets | Yes | We compare FocalBO to several baselines on the robot morphology design task from Design-Bench, which provides a large offline dataset with an exact function oracle [6]. and We use the musculoskeletal system from [62], which enables forward simulation with MuJoCo [65] and environment customization. |
| Dataset Splits | Yes | The offline dataset contains 2000 random data points and the online budget is 500 with a batch size of 10. and In this task, we use the training dataset with 10,000 points and additionally evaluate 128 points on-the-fly with a batch size of 4. (A generic batch-BO loop matching these budgets is sketched after the table.) |
| Hardware Specification | Yes | All experiments are conducted on a server with an Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz, an NVIDIA A100 GPU, and 512 GB of memory. |
| Software Dependencies | Yes | We implement FocalBO with BoTorch, a popular library for BO implementation with GPU acceleration. and We choose DKitty morphology design for its consistency in function values between the offline dataset and the online function oracle, and for its compatibility with Python 3.8+. and For each round of GP training, we fit GP hyperparameters (and variational parameters for the focalized GP and SVGP) for 1000 epochs via the Adam optimizer [63] with a learning rate of 0.01. |
| Experiment Setup | Yes | For all GPs, we use a Matérn 5/2 kernel with automatic relevance determination, and do not restrict the lengthscale or noise range. For each round of GP training, we fit GP hyperparameters (and variational parameters for the focalized GP and SVGP) for 1000 epochs via the Adam optimizer [63] with a learning rate of 0.01. For the focalized GP and SVGP, we initialize the inducing points using a Sobol sampler [64] over the input space. (See the GP-training sketch below the table.) |
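
To make the Experiment Setup row concrete, here is a minimal GPyTorch sketch of the reported training recipe: an SVGP with a Matérn 5/2 ARD kernel, inducing points drawn from a Sobol sequence over the unit cube, and 1000 epochs of Adam at learning rate 0.01. The input dimension, inducing-point count, and training data are placeholders, and this is a plain SVGP rather than the paper's focalized variant.

```python
import torch
import gpytorch
from torch.quasirandom import SobolEngine

dim, num_inducing = 56, 128           # placeholder sizes, not from the paper
train_x = torch.rand(2000, dim)       # stand-in offline data
train_y = torch.sin(train_x.sum(-1))  # stand-in targets

class SVGPModel(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        var_strat = gpytorch.variational.VariationalStrategy(
            self, inducing_points, var_dist, learn_inducing_locations=True
        )
        super().__init__(var_strat)
        self.mean_module = gpytorch.means.ConstantMean()
        # Matern 5/2 with ARD: one lengthscale per input dimension,
        # left unconstrained as in the reported setup
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.MaternKernel(nu=2.5, ard_num_dims=inducing_points.size(-1))
        )

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Inducing points initialized by a Sobol sampler over the unit-cube input space
inducing = SobolEngine(dimension=dim, scramble=True).draw(num_inducing)
model = SVGPModel(inducing)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
mll = gpytorch.mlls.VariationalELBO(likelihood, model, num_data=train_y.numel())

optimizer = torch.optim.Adam(
    list(model.parameters()) + list(likelihood.parameters()), lr=0.01
)
model.train()
likelihood.train()
for _ in range(1000):  # 1000 epochs, as reported
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    optimizer.step()
```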
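The Dataset Splits row describes a 2000-point offline initialization followed by 500 online evaluations in batches of 10. The sketch below reproduces that evaluation protocol as a generic BoTorch batch-BO loop with qExpectedImprovement; it is not the paper's FocalBO/FocalAcq hierarchy, and the objective, input dimension, and acquisition-optimizer settings are stand-ins.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import qExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

dim = 56                                               # placeholder input dimension
bounds = torch.stack([torch.zeros(dim), torch.ones(dim)])
f = lambda X: -((X - 0.5) ** 2).sum(-1, keepdim=True)  # stand-in objective

train_x = torch.rand(2000, dim)  # 2000-point offline dataset
train_y = f(train_x)

for _ in range(500 // 10):  # online budget of 500 at batch size 10
    gp = SingleTaskGP(train_x, train_y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acqf = qExpectedImprovement(gp, best_f=train_y.max())
    candidates, _ = optimize_acqf(
        acqf, bounds=bounds, q=10, num_restarts=4, raw_samples=256
    )
    train_x = torch.cat([train_x, candidates])
    train_y = torch.cat([train_y, f(candidates)])

print(f"best value after online phase: {train_y.max().item():.4f}")
```

The loop refits the surrogate each round and appends the evaluated batch, which matches the reported budgets but substitutes an exact GP and a standard acquisition for the paper's focalized sparse GP and hierarchical acquisition optimization.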