Doubly Stochastic Variational Inference for Deep Gaussian Processes
Authors: Hugh Salimbeni, Marc Deisenroth
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide strong empirical evidence that our inference scheme for DGPs works well in practice in both classification and regression. We demonstrate through extensive experiments that our approach works well in practice. We provide results on benchmark regression and classification data problems, and also demonstrate the first DGP application to a dataset with a billion points. |
| Researcher Affiliation | Collaboration | Hugh Salimbeni Imperial College London and PROWLER.io hrs13@ic.ac.uk Marc Peter Deisenroth Imperial College London and PROWLER.io m.deisenroth@imperial.ac.uk |
| Pseudocode | No | The paper describes the algorithm steps in text but does not provide a formal pseudocode block or an algorithm block labeled as such. |
| Open Source Code | Yes | Our implementation is simple (< 200 lines), publicly available, and is integrated with GPflow (Matthews et al., 2017), an open-source GP framework built on top of TensorFlow (Abadi et al., 2015). Code: https://github.com/ICL-SML/Doubly-Stochastic-DGP |
| Open Datasets | Yes | We use the UCI year dataset and the airline dataset, which has been commonly used by the large-scale GP community. For the airline dataset we take the first 700K points for training and next 100K for testing. We use a random 10% split for the year dataset. ... We apply the DGP with 2 and 3 layers to the MNIST multiclass classification problem. We use the robust-max multiclass likelihood (Hernández-Lobato et al., 2011) and use full unprocessed data with the standard training/test split of 60K/10K. |
| Dataset Splits | Yes | Following common practice (e.g. Hernández-Lobato and Adams, 2015) we use 20-fold cross validation with a 10% randomly selected held out test set and scale the inputs and outputs to zero mean and unit standard deviation within the training set (we restore the output scaling for evaluation). A sketch of this split-and-scale protocol is given after the table. |
| Hardware Specification | No | The paper states that its approach "benefits from GPU acceleration" but does not specify any particular GPU models, CPUs, or other hardware details used for the experiments. |
| Software Dependencies | No | The paper cites "GPflow: A Gaussian process library using TensorFlow" (Matthews et al., 2017) and TensorFlow (Abadi et al., 2015), but these are citations to papers; no version numbers are given for either software component. |
| Experiment Setup | Yes | All our experiments were run with exactly the same hyperparameters and initializations. See the supplementary material for details. We use min(30, D0) for all the inner layers of our DGP models, where D0 is the input dimension, and the RBF kernel for all layers. A sketch of this layer-width and kernel configuration follows the table. |
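The "Dataset Splits" row quotes the paper's evaluation protocol. The snippet below is a minimal sketch of that protocol, not code from the authors' repository: it draws a 10% random held-out test set, standardises inputs and outputs using training-set statistics only, and returns a helper for restoring the output scale at evaluation time. Function and variable names are our own.

```python
import numpy as np

def split_and_scale(X, Y, test_fraction=0.1, seed=0):
    """One random 10% held-out split with train-set standardisation (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    # Statistics computed on the training set only, as described in the paper.
    X_mean, X_std = X[train_idx].mean(0), X[train_idx].std(0) + 1e-6
    Y_mean, Y_std = Y[train_idx].mean(0), Y[train_idx].std(0) + 1e-6

    X_train = (X[train_idx] - X_mean) / X_std
    Y_train = (Y[train_idx] - Y_mean) / Y_std
    X_test = (X[test_idx] - X_mean) / X_std
    Y_test = Y[test_idx]  # kept in original units for evaluation

    def unscale(Y_pred):
        # Restore the output scaling so metrics are reported in the original units.
        return Y_pred * Y_std + Y_mean

    return (X_train, Y_train, X_test, Y_test), unscale
```

Repeating this over 20 random seeds corresponds to the quoted 20-fold cross-validation setup.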
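The "Experiment Setup" row states that inner layers have dimension min(30, D0) and that every layer uses an RBF kernel. The snippet below is a hypothetical illustration of that configuration using GPflow-1.x-style kernel constructors (the era of the paper's public code); the exact constructor arguments, and how the kernels are passed to the repository's DGP class, should be checked against https://github.com/ICL-SML/Doubly-Stochastic-DGP.

```python
import gpflow  # GPflow 1.x assumed; kernels take input_dim as the first argument

def make_dgp_kernels(input_dim, num_layers):
    """One RBF kernel per layer, inner layers of width min(30, input_dim) (illustrative only)."""
    inner_dim = min(30, input_dim)
    # Input dimension seen by each layer: the data dimension for the first layer,
    # then the (capped) inner width for the remaining layers.
    layer_dims = [input_dim] + [inner_dim] * (num_layers - 1)
    return [gpflow.kernels.RBF(d) for d in layer_dims]
```

For example, a 3-layer DGP on an 8-dimensional dataset would use layer input dimensions [8, 8, 8], while a dataset with 90 input dimensions would use [90, 30, 30].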