Deep Neural Networks as Point Estimates for Deep Gaussian Processes

Authors: Vincent Dutordoir, James Hensman, Mark van der Wilk, Carl Henrik Ek, Zoubin Ghahramani, Nicolas Durrande

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | These claims are supported by experimental results on regression and classification datasets.
Researcher Affiliation | Collaboration | Vincent Dutordoir (University of Cambridge); James Hensman (Google Brain); Mark van der Wilk (Imperial College London); Carl Henrik Ek (University of Cambridge); Zoubin Ghahramani (University of Cambridge and Google Brain); Nicolas Durrande (work done while at Secondmind).
Pseudocode | No | The paper describes the mathematical formulations and steps of the method but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper references the open-source libraries it likely used (GPflow, GPflux), but does not state that the code for its own methodology or experiments is released, nor does it link to such a repository.
Open Datasets | Yes | "We compare a series of models on a range of regression problems. ... Large scale image classification: For MNIST and Fashion-MNIST ... For CIFAR-10"
Dataset Splits | No | "Figure 7: Root Mean Squared Error (RMSE) and Negative Log Predictive Density (NLPD) with 25% and 75% quantile error bars based on 5 splits." Although '5 splits' is mentioned, the exact train/validation/test percentages or sample counts for these splits are not given for any dataset; only the NN+TS baseline mentions a 'held-out validation set', and this is not generalized to the other models or experiments. (See the quantile-aggregation sketch after this table.)
Hardware Specification | No | The paper does not report hardware details such as the GPU/CPU models or memory used to run its experiments.
Software Dependencies | No | The paper mentions software such as TensorFlow (used by GPflow) and GPflux, but does not provide version numbers for these or other key software components used in the experiments.
Experiment Setup | Yes | "We use three-layered models with 128 inducing variables (or, equivalently, hidden units). In each layer, the number of output heads is equal to the input dimensionality of the data. The Activated DGP (ADGP) and neural network approaches (NN, NN+Dropout, NN Ensembles and NN+TS) use Softplus activation functions. The Dropout baseline [24] uses a rate of 0.1 during train and test." (A minimal sketch of this setup follows the table.)
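
To make the quoted setup concrete, here is a minimal sketch of the NN+Dropout baseline as we read it: three layers, each with 128 Softplus hidden units projected to as many output heads as the data has input dimensions, and dropout at rate 0.1 left active at both train and test time. This is an interpretation in plain TensorFlow/Keras, not the authors' code; the function name, layer arrangement, and example dimensions are assumptions.

    import tensorflow as tf

    def build_nn_dropout_baseline(input_dim: int, num_layers: int = 3) -> tf.keras.Model:
        """Hypothetical reading of the paper's NN+Dropout baseline."""
        inputs = tf.keras.Input(shape=(input_dim,))
        x = inputs
        for _ in range(num_layers):
            # 128 Softplus hidden units, mirroring the 128 inducing variables per layer.
            x = tf.keras.layers.Dense(128, activation="softplus")(x)
            # Output heads equal to the input dimensionality of the data.
            x = tf.keras.layers.Dense(input_dim)(x)
            # training=True keeps dropout active at prediction time too,
            # matching "a rate of 0.1 during train and test".
            x = tf.keras.layers.Dropout(0.1)(x, training=True)
        outputs = tf.keras.layers.Dense(1)(x)  # single regression head (assumed)
        return tf.keras.Model(inputs, outputs)

    model = build_nn_dropout_baseline(input_dim=8)
    model.compile(optimizer="adam", loss="mse")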
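
The quantile error bars noted under Dataset Splits are a simple aggregation once per-split metrics exist: compute the metric on each of the 5 splits, then report the median with 25%/75% quantiles. A short numpy sketch, with placeholder RMSE values standing in for real results:

    import numpy as np

    # Placeholder per-split metrics (e.g. test RMSE on each of 5 splits);
    # real values would come from 5 independent train/test splits of a dataset.
    rmse_per_split = np.array([0.312, 0.298, 0.305, 0.321, 0.290])

    median = np.median(rmse_per_split)
    q25, q75 = np.quantile(rmse_per_split, [0.25, 0.75])
    print(f"RMSE: {median:.3f} (25%/75% quantiles: {q25:.3f}, {q75:.3f})")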