The Behavior and Convergence of Local Bayesian Optimization
Authors: Kaiwen Wu, Kyurae Kim, Roman Garnett, Jacob Gardner
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first study the behavior of the local approach, and find that the statistics of individual local solutions of Gaussian process sample paths are surprisingly good compared to what we would expect to recover from global methods. We then present the first rigorous analysis of such a Bayesian local optimization algorithm recently proposed by Müller et al. (2021), and derive convergence rates in both the noiseless and noisy settings. |
| Researcher Affiliation | Academia | Kaiwen Wu (University of Pennsylvania, kaiwenwu@seas.upenn.edu); Kyurae Kim (University of Pennsylvania, kyrkim@seas.upenn.edu); Roman Garnett (Washington University in St. Louis, garnett@wustl.edu); Jacob R. Gardner (University of Pennsylvania, jacobrg@seas.upenn.edu) |
| Pseudocode | Yes | Algorithm 1: A Local Bayesian Optimization Algorithm |
| Open Source Code | Yes | The code is available at https://github.com/kayween/local-bo-convergence. |
| Open Datasets | No | Specifically, we study the local solutions of functions f drawn from Gaussian processes with known hyperparameters. In this setting, Gaussian process sample paths can be drawn as differentiable functions adapting the techniques described in Wilson et al. [29]. The paper does not use or link to a publicly available dataset in the traditional sense; it generates data (GP sample paths) for its experiments. |
| Dataset Splits | No | The paper does not specify train/validation/test splits for any dataset, as its experiments involve optimizing functions drawn from Gaussian processes rather than using pre-existing partitioned datasets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, or memory) used to conduct the experiments. |
| Software Dependencies | No | The paper mentions software components like 'BFGS' and 'L-BFGS' and specific algorithms like 'GP-UCB' but does not specify version numbers for any programming languages, libraries, or detailed software dependencies. |
| Experiment Setup | Yes | To do this, we optimize 50 sample paths, starting from x = 0, drawn from a centered Gaussian process with an RBF kernel (unit outputscale and unit lengthscale), using a variety of observation noise standard deviations σ. We then run iterations of local Bayesian optimization (as described later in Algorithm 1) to convergence or until an evaluation budget of 5000 is reached. In the noiseless setting (σ = 0), we modify Algorithm 1 to pass our gradient estimates to BFGS rather than applying the standard gradient update rule for efficiency. (Sketches of the sample-path construction and of this local update follow the table.) |
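The Open Datasets and Experiment Setup rows describe objectives that are Gaussian process sample paths drawn as differentiable functions following Wilson et al. [29]. The snippet below is a minimal illustrative sketch, not the authors' code: it draws an approximately differentiable sample path from a centered GP with a unit-lengthscale, unit-outputscale RBF kernel using a plain random Fourier feature construction (the prior-sampling component of such pathwise approaches). The feature count, seed handling, and function names are assumptions for illustration.

```python
# Sketch (assumed details): approximate sample path of a centered GP with an
# RBF kernel (unit lengthscale and outputscale) via random Fourier features.
import numpy as np

def draw_rbf_sample_path(dim, num_features=1024, seed=0):
    """Return f(x) and grad_f(x) as closures over shared random features."""
    rng = np.random.default_rng(seed)
    # For a unit-lengthscale RBF kernel the spectral density is standard normal.
    omega = rng.standard_normal((num_features, dim))      # frequencies
    phase = rng.uniform(0.0, 2.0 * np.pi, num_features)   # phases
    weights = rng.standard_normal(num_features)           # feature weights
    scale = np.sqrt(2.0 / num_features)

    def f(x):
        return scale * np.dot(weights, np.cos(omega @ x + phase))

    def grad_f(x):
        # d/dx cos(omega @ x + phase) = -sin(omega @ x + phase) * omega
        return scale * (weights * -np.sin(omega @ x + phase)) @ omega

    return f, grad_f

f, grad_f = draw_rbf_sample_path(dim=2)
x0 = np.zeros(2)
print(f(x0), grad_f(x0))  # the path and its gradient at the starting point x = 0
```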
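The Pseudocode and Experiment Setup rows refer to Algorithm 1, a local Bayesian optimization algorithm in the style of Müller et al. (2021) that takes steps along gradient estimates derived from the Gaussian process posterior. The sketch below conveys the flavor of a single such local step under assumed details (the probe locations, step size, and helper names are hypothetical); it conditions a GP with the known unit-lengthscale RBF kernel on a few noisy evaluations near the current iterate and moves along the posterior mean gradient. It is not the authors' implementation, which is available at the linked repository.

```python
# Sketch (assumed details): one GIBO-style local step using the gradient of
# the GP posterior mean as a gradient estimate of the objective.
import numpy as np

def rbf(A, B):
    # k(a, b) = exp(-||a - b||^2 / 2), unit lengthscale and outputscale
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2)

def posterior_mean_grad(x, X, y, noise_std):
    """Gradient of the GP posterior mean at x, given observations (X, y)."""
    K = rbf(X, X) + noise_std**2 * np.eye(len(X))
    alpha = np.linalg.solve(K, y)              # (K + sigma^2 I)^{-1} y
    k_star = rbf(x[None, :], X).ravel()        # k(x, x_i)
    # d/dx k(x, x_i) = -(x - x_i) * k(x, x_i) for the unit-lengthscale RBF
    dk = -(x[None, :] - X) * k_star[:, None]
    return dk.T @ alpha

def local_bo_step(x, objective, noise_std=0.1, num_probes=8, step_size=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Probe the objective at small perturbations around the current iterate.
    X = x + 0.1 * rng.standard_normal((num_probes, x.size))
    y = np.array([objective(xi) + noise_std * rng.standard_normal() for xi in X])
    g = posterior_mean_grad(x, X, y, noise_std)
    return x + step_size * g                   # ascent step along the gradient estimate
```

In the paper's setup such steps are iterated until convergence or until the evaluation budget of 5000 is exhausted; in the noiseless setting (σ = 0), the gradient estimates are instead passed to BFGS.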