Scalable Thompson Sampling using Sparse Gaussian Process Models

Authors: Sattar Vakili, Henry Moss, Artem Artemev, Vincent Dutordoir, Victor Picheny

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our algorithm achieves near state-of-the-art regret performance on standard benchmarks through extensive numerical experiments.
Researcher Affiliation | Academia | University of Cambridge, Imperial College London, Vector Institute, University of Toronto, The Alan Turing Institute
Pseudocode | Yes | Algorithm 1: Scalable Sparse GP-TS for Bayesian Optimization
Open Source Code | Yes | Our code is available at https://github.com/lucas-schmied/sgpts
Open Datasets | Yes | We benchmark S-GP-TS on common synthetic Bayesian optimization functions as well as a range of practical black-box optimization tasks from DeepGP [27] and BOCS [28]. For real-world data, we use the datasets from DeepGP [27], which comprise 9 real-world tasks and are downloaded from the authors' GitHub repository.
Dataset Splits | No | The paper addresses Bayesian optimization, which does not typically involve explicit train/validation/test splits of a static dataset; performance is instead evaluated over BO iterations. No specific dataset splits (e.g., percentages or counts) are reported for the benchmark functions or real-world tasks.
Hardware Specification | Yes | All experiments were run on an NVIDIA GeForce RTX 2080 Ti GPU.
Software Dependencies | No | Our code is implemented in PyTorch [39] and uses the GPyTorch [12] library for Gaussian processes. For the sparse GP components, we use the methods from Spearmint [42] (which relies on GPflow [29]) and GPy [4].
Experiment Setup | Yes | All tasks are evaluated over 50 BO iterations across 10 random seeds. The maximum number of training points for our sparse GPs is set to 200, which leads to a batch size of 20 when using 10 random seeds per iteration. We follow [48] for initialization of the inducing points. We consider three acquisition functions: Expected Improvement (EI) [34], Upper Confidence Bound (UCB) [35] with parameter β = 1.0, and Thompson Sampling (TS) [44]. For the GP hyperparameter optimization, we use 100 Adam [19] iterations with a learning rate of 0.01.
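
The Pseudocode and Experiment Setup rows above describe a Thompson-sampling loop driven by a sparse (inducing-point) GP surrogate. Below is a minimal, hypothetical sketch of such a loop, written against GPflow's SVGP model (one of the libraries named in the Software Dependencies row); it is not the authors' released implementation. The 100 Adam iterations at learning rate 0.01 and the cap of 200 inducing points follow the setup reported in the table, while the objective, bounds, candidate-grid size, default batch size, and the random-subset initialization of the inducing points (a stand-in for the scheme cited as [48]) are illustrative assumptions.

```python
# Hypothetical sketch only: a batched Thompson-sampling BO loop on top of a
# sparse variational GP (GPflow SVGP). Function names, bounds handling, and
# the candidate grid are assumptions, not the paper's released code.
import numpy as np
import tensorflow as tf
import gpflow


def sparse_gp_thompson_sampling(objective, bounds, n_init=20, n_iters=50,
                                batch_size=20, max_inducing=200,
                                n_candidates=1000, seed=0):
    """Minimise `objective` over the box `bounds` (shape [dim, 2])."""
    rng = np.random.default_rng(seed)
    dim = bounds.shape[0]
    lo, hi = bounds[:, 0], bounds[:, 1]

    # Initial design drawn uniformly from the search space.
    X = rng.uniform(lo, hi, size=(n_init, dim))
    Y = objective(X).reshape(-1, 1)

    for _ in range(n_iters):
        # Sparse GP surrogate with at most `max_inducing` inducing points,
        # here initialised on a random subset of the observed inputs
        # (a simple stand-in for the initialisation scheme cited as [48]).
        Z = X[rng.choice(len(X), size=min(max_inducing, len(X)), replace=False)]
        model = gpflow.models.SVGP(
            kernel=gpflow.kernels.Matern52(lengthscales=np.ones(dim)),
            likelihood=gpflow.likelihoods.Gaussian(),
            inducing_variable=Z,
            num_data=len(X),
        )

        # Hyperparameter / variational optimisation: 100 Adam iterations at
        # learning rate 0.01, matching the Experiment Setup row.
        opt = tf.optimizers.Adam(learning_rate=0.01)
        for _ in range(100):
            with tf.GradientTape() as tape:
                loss = model.training_loss((X, Y))  # negative ELBO
            grads = tape.gradient(loss, model.trainable_variables)
            opt.apply_gradients(zip(grads, model.trainable_variables))

        # Thompson-sampling acquisition: draw `batch_size` joint posterior
        # samples over a random candidate grid and take each sample's
        # minimiser as one point of the evaluation batch.
        candidates = rng.uniform(lo, hi, size=(n_candidates, dim))
        samples = model.predict_f_samples(
            candidates, num_samples=batch_size, full_cov=True
        )  # shape [batch_size, n_candidates, 1]
        batch_idx = tf.argmin(samples[:, :, 0], axis=1).numpy()
        X_batch = candidates[batch_idx]

        # Evaluate the batch and grow the data set.
        X = np.vstack([X, X_batch])
        Y = np.vstack([Y, objective(X_batch).reshape(-1, 1)])

    return X, Y
```

Under these assumptions, drawing independent joint posterior samples and minimising each one is what yields a batch of evaluations per BO iteration; this sampling step is what replaces EI or UCB when Thompson sampling is the acquisition rule.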