Marginalised Gaussian Processes with Nested Sampling

Authors: Fergus Simpson, Vidhi Lalchand, Carl Edward Rasmussen

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We benchmark against Hamiltonian Monte Carlo on time-series and two-dimensional regression tasks, finding that a principled approach to quantifying hyperparameter uncertainty substantially improves the quality of prediction intervals. In this section we present results from a series of experiments on synthetic data in one and two dimensions, as well as real-world time-series data."
Researcher Affiliation | Collaboration | Fergus Simpson, Secondmind, Cambridge, UK (fergus@secondmind.ai); Vidhi Lalchand, University of Cambridge, UK (vr308@cam.ac.uk); Carl E. Rasmussen, University of Cambridge, UK (cer54@cam.ac.uk)
Pseudocode | Yes | Algorithm 1: Nested Sampling for hyperparameter inference
Open Source Code | Yes | Code available at https://github.com/frgsimpson/nsampling
Open Datasets | Yes | "The raw data is available at https://github.com/jamesrobertlloyd/gpss-research/tree/master/data/tsdlr-renamed"
Dataset Splits | No | The evaluation was conducted with a 60/40 train/test split; the paper does not explicitly mention a separate validation split or its size.
Hardware Specification | Yes | "Figure 6 depicts the wall clock time required for training during our time-series experiments, all of which utilised a single Nvidia GTX1070 GPU."
Software Dependencies | No | The paper mentions software packages such as DYNESTY, GPflow, GPyTorch, and PyMC3, but does not provide version numbers for these dependencies.
Experiment Setup | Yes | "ML-II uses five random restarts with an initialisation protocol tied to the training data. Following protocols from Wilson and Adams [2013], the SM weights (w_i) were initialised to the standard deviation of the targets y scaled by the number of components (Q = 2). The SM bandwidths (σ_i) were initialised to points randomly drawn from a truncated Gaussian |N(0, max d(x, x')²)|, where max d(x, x') is the maximum distance between two training points, and the mean frequencies (µ_i) were drawn from Unif(0, ν_N) to bound against degenerate frequencies. HMC used LogNormal(0, 2) priors for all the hyperparameters. [...] ML-II was trained with Adam (learning rate = 0.05) with 10 restarts and 2,000 iterations."
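The spectral-mixture (SM) initialisation protocol quoted above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function and variable names are our own, we assume 1-D inputs, and we read ν_N as the Nyquist frequency implied by the smallest spacing between training inputs (consistent with the stated goal of bounding against degenerate frequencies).

```python
import math
import random

def init_sm_hypers(x, y, n_components=2, seed=0):
    """Sketch of the SM-kernel initialisation described in the paper's
    experiment setup (after Wilson and Adams [2013]). Assumes distinct
    1-D inputs x; all names here are hypothetical, not from the paper."""
    rng = random.Random(seed)
    n = len(y)

    # Weights w_i: standard deviation of the targets y, scaled by the
    # number of mixture components Q.
    mean_y = sum(y) / n
    std_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    weights = [std_y / n_components for _ in range(n_components)]

    # Bandwidths sigma_i: draws from the truncated (folded) Gaussian
    # |N(0, max d(x, x')^2)|, where max d is the largest pairwise
    # distance between training points.
    max_d = max(abs(a - b) for a in x for b in x)
    bandwidths = [abs(rng.gauss(0.0, max_d)) for _ in range(n_components)]

    # Mean frequencies mu_i: Unif(0, nu_N), with nu_N taken here as the
    # Nyquist frequency from the minimum input spacing (assumption).
    min_spacing = min(abs(a - b) for i, a in enumerate(x) for b in x[i + 1:])
    nyquist = 1.0 / (2.0 * min_spacing)
    frequencies = [rng.uniform(0.0, nyquist) for _ in range(n_components)]

    return weights, bandwidths, frequencies
```

A run on a toy series returns Q equal weights, non-negative bandwidths, and frequencies bounded by the Nyquist limit; the actual experiments draw these inside each of the five ML-II restarts.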