Batched High-dimensional Bayesian Optimization via Structural Kernel Learning

Authors: Zi Wang, Chengtao Li, Stefanie Jegelka, Pushmeet Kohli

ICML 2017

Reproducibility assessment. Each entry below gives the variable assessed, the result, and the LLM response quoted as evidence.
Variable: Research Type
Result: Experimental
LLM Response: (Section 5, Empirical Results) "We empirically evaluate our approach in two parts: first, we verify the effectiveness of using our Gibbs sampling algorithm to learn the additive structure of the unknown function, and then we test our batch BO for high-dimensional problems with the Gibbs sampler." (A sketch of this structure-learning step appears after these entries.)
Variable: Researcher Affiliation
Result: Collaboration
LLM Response: "(1) Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Massachusetts, USA; (2) DeepMind, London, UK."
Variable: Pseudocode
Result: No
LLM Response: "The full algorithm is shown in the appendix."
Variable: Open Source Code
Result: Yes
LLM Response: "Our code is available at https://github.com/zi-w/Structural-Kernel-Learning-for-HDBBO."
Variable: Open Datasets
Result: No
LLM Response: "We tested on 2, 10, 20 and 50-dimensional functions sampled from a zero-mean Add-GP... For a real-data experiment, we tested the diverse batch sampling algorithms for BBO on the Walker function, which returns the walking speed of a three-link planar bipedal walker implemented in Matlab (Westervelt et al., 2007). ... We tested on 2, 10, 20 and 50-dimensional functions sampled the same way as in Section 5.1; we assume the ground-truth decomposition of the feature space is known." (A sketch of drawing such test functions from an additive GP prior appears after these entries.)
Variable: Dataset Splits
Result: No
LLM Response: "Let $D_n = \{(x_t, y_t)\}_{t=1}^{n}$ be the data we observed from $f$... Given the observed data $D_n = \{(x_t, y_t)\}_{t=1}^{n}$..." (The paper refers only to observations accumulated during optimization; it does not specify any training/validation/test split.)
Variable: Hardware Specification
Result: No
LLM Response: "We thank MIT Supercloud and the Lincoln Laboratory Supercomputing Center for providing computational resources."
Variable: Software Dependencies
Result: No
LLM Response: "...implemented in Matlab (Westervelt et al., 2007)... implemented with a physics engine, the Box2D simulator (Catto, 2011)."
Variable: Experiment Setup
Result: Yes
LLM Response: "For Add-GP-UCB, we used $\beta_t^{(m)} = |A_m| \log(2t)$ for lower dimensions (D = 2, 5, 10), and $\beta_t^{(m)} = |A_m| \log(2t)/5$ for higher dimensions (D = 20, 30, 50). We set the burn-in period to be 50 iterations, and the total number of iterations for Gibbs sampling to be 100. The learning of the decomposition is done every 50 iterations." (These settings are collected in a configuration sketch after these entries.)
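
The structure learning quoted in the Research Type entry Gibbs-samples a decomposition of the input dimensions into additive groups. Below is a minimal sketch of that idea, assuming an additive RBF kernel with fixed, hand-picked hyperparameters and a uniform prior over assignments; the names (`additive_rbf_gram`, `gibbs_decomposition`) and all constants are illustrative, not the authors' implementation (their Matlab code is in the linked repository).

```python
import numpy as np

def additive_rbf_gram(X, z, lengthscale=1.0, signal_var=1.0):
    """Gram matrix of an additive RBF kernel: one RBF component per group,
    each acting only on the input dimensions assigned to that group."""
    n = X.shape[0]
    K = np.zeros((n, n))
    for m in np.unique(z):
        Xm = X[:, z == m]
        sq = np.sum((Xm[:, None, :] - Xm[None, :, :]) ** 2, axis=-1)
        K += signal_var * np.exp(-0.5 * sq / lengthscale ** 2)
    return K

def log_marginal_likelihood(X, y, z, noise_var=1e-2):
    """GP log marginal likelihood of (X, y) under the additive kernel
    induced by the assignment vector z (dimension j -> group z[j])."""
    n = len(y)
    K = additive_rbf_gram(X, z) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

def gibbs_decomposition(X, y, n_groups, n_iters=100, burn_in=50, seed=0):
    """Gibbs-sample group assignments: each coordinate's group is resampled
    from a conditional proportional to the GP marginal likelihood of the
    data under the resulting additive kernel."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    z = rng.integers(n_groups, size=D)    # random initial decomposition
    samples = []
    for it in range(n_iters):
        for j in range(D):
            log_p = np.array([
                log_marginal_likelihood(X, y, np.where(np.arange(D) == j, m, z))
                for m in range(n_groups)
            ])
            log_p -= log_p.max()          # stabilize before exponentiating
            p = np.exp(log_p)
            z[j] = rng.choice(n_groups, p=p / p.sum())
        if it >= burn_in:
            samples.append(z.copy())
    return samples
```

This naive version refactorizes the Gram matrix for every candidate assignment; a practical implementation would cache factorizations and treat the kernel hyperparameters properly rather than fixing them.
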
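
The Open Datasets entry notes that the synthetic benchmarks are functions drawn from a zero-mean Add-GP rather than a fixed dataset. A sketch of drawing one such test function at a finite set of points, reusing `additive_rbf_gram` from the sketch above; the ground-truth decomposition, lengthscale, grid size, and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
D, n_points = 10, 200
# Assumed ground-truth decomposition: consecutive pairs of dimensions
z_true = np.repeat(np.arange(D // 2), 2)

X = rng.uniform(0.0, 1.0, size=(n_points, D))       # evaluation locations
K = additive_rbf_gram(X, z_true, lengthscale=0.5)   # additive GP prior covariance
f = rng.multivariate_normal(np.zeros(n_points), K + 1e-8 * np.eye(n_points))
y = f + 1e-1 * rng.standard_normal(n_points)        # noisy observations
```
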
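
The Experiment Setup entry quotes concrete hyperparameters; the snippet below just collects them, with the $\beta_t^{(m)}$ rule written out as code. The dimension threshold separating the two regimes is inferred from the quoted D values.

```python
import numpy as np

def beta(t, group_size, input_dim):
    """Exploration coefficient for Add-GP-UCB as reported in the paper:
    beta_t^(m) = |A_m| * log(2t), divided by 5 in higher dimensions."""
    b = group_size * np.log(2 * t)
    return b / 5.0 if input_dim >= 20 else b

BURN_IN_ITERS = 50        # Gibbs burn-in period
GIBBS_TOTAL_ITERS = 100   # total Gibbs sampling iterations
RELEARN_PERIOD = 50       # decomposition re-learned every 50 iterations
```
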