BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization

Authors: Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, Eytan Bakshy

NeurIPS 2020

Reproducibility Assessment (variable, assessed result, and supporting LLM response):
Research Type: Experimental (6 experiments). "Our results provide three main takeaways. First, we find that BoTorch's algorithms tend to achieve greater sample efficiency compared to those of other packages (all packages use their default models and settings). Second, we find that OKG often outperforms all other acquisition functions."
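For reference, the OKG compared above is the paper's one-shot knowledge gradient. Below is a minimal sketch of instantiating it through BoTorch's public qKnowledgeGradient API; the toy data, fantasy count, and optimizer settings are illustrative assumptions, not the paper's benchmark configuration.

```python
# Minimal, illustrative OKG sketch (toy data; not the paper's benchmark setup).
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll  # fit_gpytorch_model in older releases
from botorch.acquisition import qKnowledgeGradient
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(10, 2, dtype=torch.double)               # 10 random designs in [0, 1]^2
train_Y = -(train_X - 0.5).pow(2).sum(dim=-1, keepdim=True)   # toy objective values

model = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

# One-shot KG: the fantasy points are optimized jointly with the candidate,
# which is what makes the formulation "one-shot".
qKG = qKnowledgeGradient(model, num_fantasies=64)
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)
candidate, acq_value = optimize_acqf(qKG, bounds=bounds, q=1, num_restarts=10, raw_samples=512)
```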
Researcher Affiliation: Collaboration. Maximilian Balandat (Facebook, balandat@fb.com); Brian Karrer (Facebook, briankarrer@fb.com); Daniel R. Jiang (Facebook, drjiang@fb.com); Samuel Daulton (Facebook, sdaulton@fb.com); Benjamin Letham (Facebook, bletham@fb.com); Andrew Gordon Wilson (New York University, andrewgw@cims.nyu.edu); Eytan Bakshy (Facebook, ebakshy@fb.com)
Pseudocode: Yes. Code Example 1: multi-objective optimization via augmented Chebyshev scalarizations. Code Example 2: parallel Noisy EI. Code Example 3: implementation of one-shot KG.
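The paper's Code Example 1 concerns multi-objective optimization via augmented Chebyshev scalarizations. The sketch below shows that pattern using BoTorch's get_chebyshev_scalarization utility; the toy outcomes and randomly drawn weights are assumptions, and the exact listing in the paper may differ.

```python
# Hypothetical sketch of scalarized multi-objective BO in the spirit of the
# paper's Code Example 1; toy data and random weights are assumptions.
import torch
from botorch.acquisition.objective import GenericMCObjective
from botorch.utils.multi_objective.scalarization import get_chebyshev_scalarization
from botorch.utils.sampling import sample_simplex

Y = torch.randn(20, 2, dtype=torch.double)                      # toy observations of two objectives
weights = sample_simplex(d=2, dtype=torch.double).squeeze(0)    # random weights on the simplex

# Augmented Chebyshev scalarization: a worst-case (min) term plus a small
# weighted-sum term, normalized against the observed outcomes Y.
scalarization = get_chebyshev_scalarization(weights=weights, Y=Y)

# The wrapped callable can be passed to any MC acquisition via `objective=`.
objective = GenericMCObjective(scalarization)
```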
Open Source Code: Yes. "We make these methodological and theoretical contributions available in our open-source library BoTorch (https://botorch.org), a modern programming framework for BO that features a modular design and flexible API, our distinct SAA approach, and algorithms specifically designed to exploit modern computing paradigms such as parallelization and auto-differentiation."
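To illustrate the quoted SAA (sample average approximation) approach: BoTorch draws quasi-Monte-Carlo base samples once and holds them fixed, so an MC acquisition value becomes a deterministic, differentiable function of the candidate set. A minimal sketch with assumed toy data and sample counts follows; note the sampler constructor took num_samples rather than sample_shape in pre-0.8 BoTorch releases.

```python
# Minimal SAA sketch: fixed quasi-MC base samples make the MC acquisition
# deterministic, so gradients flow through it via autodiff. Toy data assumed.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll  # fit_gpytorch_model in older releases
from botorch.acquisition import qExpectedImprovement
from botorch.sampling import SobolQMCNormalSampler
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(15, 2, dtype=torch.double)
train_Y = train_X.sum(dim=-1, keepdim=True)

model = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

# Sobol base samples are drawn once and reused across evaluations.
sampler = SobolQMCNormalSampler(sample_shape=torch.Size([256]))
qEI = qExpectedImprovement(model, best_f=train_Y.max(), sampler=sampler)

X = torch.rand(1, 4, 2, dtype=torch.double, requires_grad=True)  # a q = 4 candidate set
acq_value = qEI(X)       # single-element tensor
acq_value.backward()     # gradients of the fixed-sample MC estimate w.r.t. X
```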
Open Datasets: Yes. Synthetic test functions: "We consider BO for parallel optimization of q = 4 design points, on four noisy synthetic functions used in Wang et al. [100]: Branin, Rosenbrock, Ackley, and Hartmann." [...] "(2) Tuning 6 parameters of a neural network surrogate model for the UCI Adult data set [56] introduced by Falkner et al. [22], available as part of HPOlib2 [21]" [...] "(3) Tuning 3 parameters of the recently proposed Stochastic Weight Averaging (SWA) procedure of Izmailov et al. [40] on the VGG-16 [93] architecture for CIFAR-10"
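The four synthetic functions named in this entry ship with BoTorch under botorch.test_functions, which makes this part of the benchmark straightforward to reconstruct. The dimensions and noise level below are placeholders, not the settings from Wang et al. [100] or the paper.

```python
# Sketch of evaluating the named noisy synthetic benchmarks; dims and
# noise_std here are placeholder assumptions, not the paper's settings.
import torch
from botorch.test_functions import Ackley, Branin, Hartmann, Rosenbrock

problems = [
    Branin(noise_std=0.1),            # fixed d = 2
    Rosenbrock(dim=3, noise_std=0.1),
    Ackley(dim=4, noise_std=0.1),
    Hartmann(dim=6, noise_std=0.1),
]
for problem in problems:
    lb, ub = problem.bounds                           # each function's native domain
    X = lb + (ub - lb) * torch.rand(4, problem.dim)   # a q = 4 batch of random designs
    Y = problem(X)                                    # noisy evaluations, shape (4,)
    print(type(problem).__name__, Y.shape)
```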
Dataset Splits: No. The paper uses several datasets and functions (e.g., Hartmann, Cartpole, UCI Adult, CIFAR-10) but does not state how the datasets were split into training, validation, and test sets (e.g., exact percentages, sample counts, or predefined splits a reproduction could follow).
Hardware Specification: No. Section 6.1, "Exploiting Parallelism and Hardware Acceleration", states that experiments were run "on both CPU and GPU" but does not specify the models or specifications of that hardware.
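For reproductions, hardware choice surfaces in BoTorch only through standard PyTorch device placement; a minimal sketch, with assumed toy data, follows.

```python
# Illustrative device-placement snippet (the paper does not name hardware).
import torch
from botorch.models import SingleTaskGP

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_X = torch.rand(10, 2, dtype=torch.double, device=device)
train_Y = train_X.sum(dim=-1, keepdim=True)

# BoTorch models adopt the dtype and device of their training data,
# so this GP lives on the GPU whenever one is available.
model = SingleTaskGP(train_X, train_Y)
```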
Software Dependencies: No. The paper names key software such as PyTorch, GPyTorch, and TensorFlow but gives no version numbers (e.g., "probabilistic models written in PyTorch", "an efficient and scalable implementation of GPs, GPyTorch [29]", "from TensorFlow [via GPflow, 64]").
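Absent pinned versions, a reproduction should at least record the environment it actually ran under, for example:

```python
# Record the versions of the core dependencies used in a reproduction run.
import botorch
import gpytorch
import torch

print("torch   ", torch.__version__)
print("gpytorch", gpytorch.__version__)
print("botorch ", botorch.__version__)
```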
Experiment Setup: No. While the paper describes the scope of the experiments (e.g., "parallel optimization of q = 4 design points", "Tuning 5 parameters of a deep Q-network"), it does not give concrete setup details in the main text, such as hyperparameter values (learning rates, batch sizes, number of epochs), optimizer settings, or explicit training configurations.
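A hypothetical closed-loop skeleton matching the described scope (parallel optimization of q = 4 design points, here via parallel Noisy EI as in the paper's Code Example 2) is sketched below; the objective, iteration budget, and every optimizer setting are placeholders rather than values from the paper.

```python
# Hypothetical closed-loop BO skeleton with q = 4 parallel candidates per
# iteration; all hyperparameters are placeholders, not the paper's settings.
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll  # fit_gpytorch_model in older releases
from botorch.acquisition import qNoisyExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

def objective(X):
    """Stand-in for a noisy black-box function."""
    return -(X - 0.5).pow(2).sum(dim=-1, keepdim=True) + 0.01 * torch.randn_like(X[..., :1])

bounds = torch.stack([torch.zeros(2), torch.ones(2)]).double()
train_X = torch.rand(8, 2, dtype=torch.double)
train_Y = objective(train_X)

for _ in range(10):  # placeholder evaluation budget
    model = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
    qNEI = qNoisyExpectedImprovement(model, X_baseline=train_X)
    new_X, _ = optimize_acqf(qNEI, bounds=bounds, q=4, num_restarts=10, raw_samples=256)
    train_X = torch.cat([train_X, new_X])
    train_Y = torch.cat([train_Y, objective(new_X)])
```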