Preconditioning for Scalable Gaussian Process Hyperparameter Optimization

Authors: Jonathan Wenger, Geoff Pleiss, Philipp Hennig, John Cunningham, Jacob Gardner

ICML 2022

Reproducibility Variables (Result and LLM Response)

Research Type: Experimental
"Our theoretical results enable provably efficient optimization of kernel hyperparameters, which we validate empirically on large-scale benchmark problems. There our approach accelerates training by up to an order of magnitude." (Section 5, Experiments: "We validate our theoretical findings empirically via GP hyperparameter optimization on synthetic and benchmark datasets with and without preconditioning.")

Researcher Affiliation: Academia
"1 University of Tübingen, 2 Max Planck Institute for Intelligent Systems, Tübingen, 3 Columbia University, 4 University of Pennsylvania. Correspondence to: Jonathan Wenger <jonathan.wenger@uni-tuebingen.de>."

Pseudocode: Yes
"Algorithm 1: log-Marginal Likelihood. Algorithm 2: Derivative of the log-Marginal Likelihood."

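The algorithms themselves are not reproduced in this excerpt, but their structure, a preconditioned CG solve for the quadratic term plus a log-determinant estimate, can be sketched. Below is a minimal NumPy sketch under our own simplifications: the partial (pivoted) Cholesky preconditioner is applied through the Woodbury identity, and the log-determinant, which the paper estimates by stochastic Lanczos quadrature with the preconditioner as a variance-reducing reference, is computed exactly here for brevity. All function and variable names are ours, not the authors'.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse.linalg import cg, LinearOperator

def rbf_kernel(X, lengthscale, outputscale):
    # Squared-exponential kernel: k(x, x') = o * exp(-|x - x'|^2 / (2 l^2)).
    sqdist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return outputscale * np.exp(-0.5 * sqdist / lengthscale**2)

def pivoted_cholesky(K, rank):
    # Rank-`rank` partial (pivoted) Cholesky factor L with K ~ L @ L.T.
    n = K.shape[0]
    d = np.diag(K).copy()
    perm = np.arange(n)
    L = np.zeros((n, rank))
    for k in range(rank):
        j = k + np.argmax(d[perm[k:]])        # pivot on largest residual diagonal
        perm[[k, j]] = perm[[j, k]]
        p, rest = perm[k], perm[k + 1:]
        L[p, k] = np.sqrt(d[p])
        L[rest, k] = (K[rest, p] - L[rest, :k] @ L[p, :k]) / L[p, k]
        d[rest] -= L[rest, k] ** 2
    return L

def log_marginal_likelihood(X, y, lengthscale, outputscale, noise, rank=15):
    n = X.shape[0]
    K_hat = rbf_kernel(X, lengthscale, outputscale) + noise * np.eye(n)

    # Preconditioner P = L L^T + noise * I, inverted via the Woodbury
    # identity (a rank x rank solve instead of an n x n one).
    L = pivoted_cholesky(K_hat - noise * np.eye(n), rank)
    inner = cho_factor(noise * np.eye(rank) + L.T @ L)
    P_inv = LinearOperator(
        (n, n), matvec=lambda v: (v - L @ cho_solve(inner, L.T @ v)) / noise
    )

    # Quadratic term via preconditioned CG.
    alpha, info = cg(K_hat, y, M=P_inv)
    assert info == 0, "CG did not converge"

    # Exact log-determinant for brevity; the paper uses a stochastic estimate.
    _, logdet = np.linalg.slogdet(K_hat)
    return -0.5 * y @ alpha - 0.5 * logdet - 0.5 * n * np.log(2 * np.pi)

# Small synthetic example:
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(500)
print(log_marginal_likelihood(X, y, lengthscale=0.5, outputscale=1.0, noise=0.01))
```
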
Open Source Code: Yes
"An implementation of our method is available as part of GPyTorch (Gardner et al., 2018)." github.com/cornellius-gp/gpytorch

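As a hedged usage sketch (not taken from the paper's own experiment code), the preconditioner rank can be controlled in GPyTorch via the max_preconditioner_size setting when evaluating the marginal log-likelihood; the toy data below loosely mirrors the paper's one-dimensional synthetic setting.

```python
import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ZeroMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Toy data loosely mirroring the synthetic setting (1-D, i.i.d. normal inputs).
train_x = torch.randn(10_000, 1)
train_y = torch.sin(3 * train_x).squeeze(-1) + 0.1 * torch.randn(10_000)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

model.train()
likelihood.train()
# max_preconditioner_size bounds the rank of the partial Cholesky
# preconditioner used by GPyTorch's iterative solver.
with gpytorch.settings.max_preconditioner_size(100):
    loss = -mll(model(train_x), train_y)
    loss.backward()
```
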
Open Datasets: Yes
"We consider a one-dimensional synthetic dataset of n = 10,000 iid standard normal samples, as well as a range of UCI datasets (Dua & Graff, 2017) with training set sizes ranging from n = 12,449 to 326,155 (see Table 2)."

Dataset Splits: Yes
"Hyperparameters were optimized with L-BFGS using an Armijo-Wolfe line search and early stopping via a validation set."

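A hedged sketch of that training protocol: PyTorch's L-BFGS with its built-in strong-Wolfe line search (the closest available to the Armijo-Wolfe search mentioned) and early stopping on a held-out validation loss. The patience value, improvement threshold, and loss callables are our assumptions; the excerpt does not state them.

```python
import torch

def fit_with_early_stopping(params, train_loss, val_loss, max_steps=100, patience=5):
    # `train_loss` and `val_loss` are zero-argument callables returning the
    # negative marginal log-likelihood on the respective split.
    optimizer = torch.optim.LBFGS(params, line_search_fn="strong_wolfe")
    best, stale = float("inf"), 0
    for _ in range(max_steps):
        def closure():
            optimizer.zero_grad()
            loss = train_loss()
            loss.backward()
            return loss
        optimizer.step(closure)
        with torch.no_grad():
            current = val_loss().item()
        if current < best - 1e-4:
            best, stale = current, 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation loss stopped improving
    return best
```
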
Hardware Specification: Yes
"All experiments were performed on single NVIDIA GPUs, a GeForce RTX 2080 and Titan RTX, respectively."

Software Dependencies: No
The paper mentions GPyTorch, L-BFGS, and Adam but does not provide version numbers for these software components. For example, "An implementation of our method is available as part of GPyTorch (Gardner et al., 2018)" does not specify the GPyTorch version.

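Absent pinned versions, a reproduction can at least record the environment it actually ran with. A small sketch (package names are the usual install names, assumed here):

```python
import torch
import gpytorch

# Record the exact library versions used for a reproduction run.
print("torch:", torch.__version__)
print("gpytorch:", gpytorch.__version__)
print("CUDA build:", torch.version.cuda, "| GPU available:", torch.cuda.is_available())
```
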
Experiment Setup: Yes
"We perform GP regression using an RBF and Matérn(3/2) kernel with output scale o, lengthscales l_j (one per input dimension), and noise σ². Hyperparameters were optimized with L-BFGS using an Armijo-Wolfe line search and early stopping via a validation set. We use a partial Cholesky preconditioner throughout."

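For concreteness, the described model family translates to GPyTorch roughly as follows. This is a hedged sketch; `d` is a placeholder for the input dimension, which varies by dataset.

```python
import gpytorch

d = 8  # placeholder input dimension
# Matérn(3/2) kernel: output scale o via ScaleKernel, one lengthscale l_j per
# input dimension via ard_num_dims; swap in RBFKernel(ard_num_dims=d) for RBF.
covar_module = gpytorch.kernels.ScaleKernel(
    gpytorch.kernels.MaternKernel(nu=1.5, ard_num_dims=d)
)
# Observation noise sigma^2 lives in the Gaussian likelihood.
likelihood = gpytorch.likelihoods.GaussianLikelihood()
```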