Deep Neural Networks as Point Estimates for Deep Gaussian Processes
Authors: Vincent Dutordoir, James Hensman, Mark van der Wilk, Carl Henrik Ek, Zoubin Ghahramani, Nicolas Durrande
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper's claims are supported by experimental results on regression and classification datasets. |
| Researcher Affiliation | Collaboration | Vincent Dutordoir (University of Cambridge); James Hensman (Google Brain); Mark van der Wilk (Imperial College London); Carl Henrik Ek (University of Cambridge); Zoubin Ghahramani (University of Cambridge and Google Brain); Nicolas Durrande (work done while at Secondmind). |
| Pseudocode | No | The paper describes the mathematical formulations and steps of the method but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper references existing open-source libraries (GPflow, GPflux) that were likely used, but it does not state that source code for this paper's own methodology or experiments is released, nor does it link to such a repository. |
| Open Datasets | Yes | We compare a series of models on a range of regression problems. ... Large scale image classification: For MNIST and Fashion-MNIST ... For CIFAR-10 |
| Dataset Splits | No | Figure 7: Root Mean Squared Error (RMSE) and Negative Log Predictive Density (NLPD) with 25% and 75% quantile error bars based on 5 splits. (Although '5 splits' is mentioned, the exact train/validation/test percentages or sample counts are not given for all datasets. The NN+TS baseline explicitly mentions a 'held-out validation set', but this is not generalized to the other models or experiments.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, memory amounts, or detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions software like 'TensorFlow' (used by GPflow) and 'GPflux' but does not provide specific version numbers for these or other key software components used in the experiments. |
| Experiment Setup | Yes | We use three-layered models with 128 inducing variables (or, equivalently, hidden units). In each layer, the number of output heads is equal to the input dimensionality of the data. The Activated DGP (ADGP) and neural network approaches (NN, NN+Dropout, NN Ensembles and NN+TS) use Softplus activation functions. The Dropout baseline [24] uses a rate of 0.1 during train and test. (Illustrative code sketches of this setup appear after the table.) |
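To make the stated GP configuration concrete, the following is a minimal sketch of a single variational GP layer with 128 inducing variables in GPflow. This is an assumption-laden illustration, not the paper's released code: the paper stacks three such layers with its activated basis functions to form the ADGP, and the dimensions and names below are placeholders.

```python
import gpflow
import numpy as np

D = 8  # placeholder input dimensionality; the paper uses the data's own D

# 128 inducing inputs, matching the "128 inducing variables" in the setup.
Z = np.random.randn(128, D)

# One sparse variational GP layer whose number of output heads equals the
# input dimensionality, as described for each layer of the three-layer models.
layer = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(lengthscales=np.ones(D)),
    likelihood=gpflow.likelihoods.Gaussian(),
    inducing_variable=gpflow.inducing_variables.InducingPoints(Z),
    num_latent_gps=D,
)
```

Composing three such layers is the kind of construction GPflux provides; the exact composition used in the paper is not released, so this single layer should be read only as a reference point for the reported hyperparameters.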
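The neural-network baselines can be sketched similarly. Below is a minimal Keras reconstruction of the NN+Dropout baseline as described: three Softplus layers whose width equals the input dimensionality, with dropout (rate 0.1) kept active at both train and test time. Function and variable names are hypothetical.

```python
import tensorflow as tf

def make_dropout_baseline(input_dim: int, num_outputs: int,
                          rate: float = 0.1) -> tf.keras.Model:
    """Hypothetical three-layer Softplus network matching the stated setup."""
    inputs = tf.keras.Input(shape=(input_dim,))
    x = inputs
    for _ in range(3):
        # Layer width equals the data's input dimensionality, per the paper's
        # statement about the number of output heads in each layer.
        x = tf.keras.layers.Dense(input_dim, activation="softplus")(x)
        # training=True keeps dropout active at prediction time as well,
        # matching the stated "rate of 0.1 during train and test".
        x = tf.keras.layers.Dropout(rate)(x, training=True)
    outputs = tf.keras.layers.Dense(num_outputs)(x)
    return tf.keras.Model(inputs, outputs)
```

Because dropout stays on at test time, repeated forward passes through this model give stochastic predictions, consistent with the MC-dropout baseline [24] cited in the setup.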