Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features

Authors: Mojmir Mutny, Andreas Krause

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present cumulative regret plots for standard benchmarks with the squared exponential kernel (Figure 3). We test Thompson sampling with QFF for a fixed horizon with high-dimensional functions used previously in [14]. Details of the experiments are in the supplementary material. We compare QFF, RFF and the exact GP. In Figure 3f, we show that for each experiment the speed of computation improves significantly even though for high dimensional experiments the grid for Lipschitz optimization was twice as fine as for the exact method."
Researcher Affiliation | Academia | Mojmír Mutný, Department of Computer Science, ETH Zurich, Switzerland (mojmir.mutny@inf.ethz.ch); Andreas Krause, Department of Computer Science, ETH Zurich, Switzerland (krausea@inf.ethz.ch)
Pseudocode | Yes | "Algorithm 1: Thompson sampling with Fourier Features and additive models"
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | "We present cumulative regret plots for standard benchmarks with the squared exponential kernel (Figure 3). We test Thompson sampling with QFF for a fixed horizon with high-dimensional functions used previously in [14]."
Dataset Splits | No | The paper describes experiments on benchmark functions and a simulator but does not provide explicit training/validation/test splits with percentages or sample counts.
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU/GPU models or memory).
Software Dependencies | No | The paper does not list specific software dependencies or version numbers (e.g., programming language, libraries, or solvers).
Experiment Setup | Yes | "Let $\Phi^{(j)}(\cdot) \in \mathbb{R}^{m_j}$ be the approximation of the $j$th additive component as in Definition 3 with $m_j \geq (2\log_\eta(T^3))^{d_j}$ and $m_j \propto 1/\gamma_j^2$, where $\eta = 16/e$. ... where each acquisition function is optimized to the accuracy $\alpha_t = 1/t$" and "the grid for Lipschitz optimization was twice as fine as for the exact method."
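The quoted setup ties the number of Fourier features per additive group to the horizon $T$ and the group dimension $d_j$. A minimal sketch of how that feature count scales, assuming the (reconstructed) bound reads $m_j \geq (2\log_\eta(T^3))^{d_j}$ with $\eta = 16/e$; the exact constant comes from a garbled extraction and `min_features` is an illustrative name, not from the paper:

```python
import math

def min_features(T, d_j, eta=16 / math.e):
    """Smallest integer feature count satisfying the assumed bound
    m_j >= (2 * log_eta(T^3))^(d_j).  The constant eta = 16/e follows
    the quoted setup; the overall form is a reconstruction."""
    per_dim = 2 * math.log(T ** 3, eta)   # quadrature points per dimension
    return math.ceil(per_dim ** d_j)      # tensor grid over d_j dimensions

# Feature counts grow exponentially in the group dimension d_j,
# which is why the additive decomposition into low-dimensional
# groups matters for tractability.
print(min_features(100, 1), min_features(100, 2))
```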
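Algorithm 1 pairs Thompson sampling with a Fourier-feature approximation of the GP posterior. The sketch below shows one such sampling step as generic Bayesian linear regression on random Fourier features for the squared exponential kernel; the function names and posterior-update details are illustrative, not the paper's exact QFF/additive construction:

```python
import numpy as np

def rff_features(X, omegas, bias):
    """Random Fourier features approximating a squared-exponential kernel
    (illustrative stand-in for the paper's quadrature construction)."""
    return np.sqrt(2.0 / len(omegas)) * np.cos(X @ omegas.T + bias)

def thompson_step(Phi, y, cand_Phi, noise=0.1, rng=None):
    """One Thompson-sampling step: sample weights from the Bayesian
    linear-regression posterior over features, then maximize the
    sampled function over a candidate set."""
    rng = rng or np.random.default_rng()
    m = Phi.shape[1]
    A = Phi.T @ Phi / noise**2 + np.eye(m)      # posterior precision
    cov = np.linalg.inv(A)
    mean = cov @ Phi.T @ y / noise**2           # posterior mean of weights
    w = rng.multivariate_normal(mean, cov)      # posterior sample
    return int(np.argmax(cand_Phi @ w))         # index of next query point
```

In the paper's additive setting, this step would run with a separate feature map per low-dimensional group, keeping the feature count (and hence the cost of the linear-algebra updates) small.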