Random Feature Expansions for Deep Gaussian Processes

Authors: Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We extensively showcase the scalability and performance of our proposal on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers. We extensively demonstrate the effectiveness of our proposal on a variety of regression and classification problems by comparing it with DNNs and other state-of-the-art approaches to infer DGPs. We evaluate our model by comparing it against relevant alternatives for both regression and classification, and assess its performance when applied to large-scale datasets.
Researcher Affiliation | Academia | 1) Department of Data Science, EURECOM, France; 2) School of Computer Science and Engineering, University of New South Wales, Australia.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions that a competitor's code is available ('Code obtained from: github.com/thangbui/deepGP_approxEP') but does not provide an explicit statement or link for the source code of its own proposed method.
Open Datasets | Yes | We use the same experimental set-up for both regression and classification tasks using datasets from the UCI repository (Asuncion & Newman, 2007). We focus part of the experiments on large-scale problems, such as MNIST8M digit classification and the AIRLINE dataset.
Dataset Splits | No | The paper mentions 'withheld test data' and states that 'The results are averaged over 3 folds for every dataset,' which implies a splitting strategy. However, it does not provide specific percentages for training, validation, and test splits, nor does it define a separate validation set needed for reproduction (an illustrative 3-fold sketch is given after the table).
Hardware Specification | Yes | The experiments were launched on single nodes of a cluster of Intel Xeon E5-2630 CPUs having 32 cores and 128 GB RAM.
Software Dependencies | No | The paper states that the model was implemented in 'TensorFlow (Abadi et al., 2015)' but does not specify a version number for TensorFlow or any other software dependency, making it difficult to reproduce the exact software environment.
Experiment Setup | Yes | In the proposed DGP with an RBF kernel, we use 100 random features at every hidden layer to construct a multivariate GP with D_F^(l) = 3, and set the batch size to m = 200. We initially only use a single Monte Carlo sample, and halfway through the allocated optimization time, this is then increased to 100 samples. We employ the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.01, and in order to stabilize the optimization procedure, we fix the parameters Θ for 12,000 iterations, before jointly optimizing all parameters. For DGP-RBF and DGP-ARC, we use 500 random features, 50 GPs in the hidden layers, a batch size of 1000, and Adam with a 0.001 learning rate. We construct a DNN configured with a dropout rate of 0.5 at each hidden layer in order to provide regularization during training. (An illustrative random-feature layer sketch is given after the table.)
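
Since the paper reports results averaged over 3 folds but gives no explicit split percentages, the following is a minimal sketch of one plausible evaluation protocol. Shuffled folds, the random seed, and the absence of a separate validation split are assumptions, and the fit_and_score callable is a hypothetical placeholder for training a model and scoring it on the withheld fold.

```python
# Minimal sketch of a 3-fold evaluation protocol; fold construction and shuffling
# are assumptions, not the paper's stated recipe.
import numpy as np
from sklearn.model_selection import KFold

def average_over_folds(X, y, fit_and_score, n_folds=3, seed=0):
    """Average a test metric over n_folds train/test splits."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kf.split(X):
        # fit_and_score is a hypothetical callable: train on the first pair of
        # arrays and return an error/likelihood metric on the withheld pair.
        scores.append(fit_and_score(X[train_idx], y[train_idx],
                                    X[test_idx], y[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))
```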
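
To make the quoted experiment setup concrete, below is a minimal NumPy sketch of the random-feature expansion of an RBF kernel for a single DGP hidden layer, using the figures quoted above (100 random features per hidden layer, D_F^(l) = 3, mini-batches of m = 200). Variable names, the weight initialisation, and the fixed draw of the spectral frequencies Omega are illustrative assumptions; this is not the authors' TensorFlow implementation, which treats the layer weights variationally and follows the staged Adam schedule described in the table.

```python
# Sketch of one DGP hidden layer built from RBF random features (Rahimi-Recht
# style trigonometric features), with the layer widths quoted in the section.
import numpy as np

rng = np.random.default_rng(0)

def rbf_random_features(X, Omega, log_sigma2, log_lengthscale):
    """Map inputs X (n, d_in) to trigonometric random features (n, 2*n_rf)."""
    # Dividing inputs by the lengthscale is equivalent to sampling Omega from
    # the RBF spectral density with that lengthscale.
    XO = (X / np.exp(log_lengthscale)) @ Omega            # (n, n_rf)
    scale = np.sqrt(np.exp(log_sigma2) / Omega.shape[1])
    return scale * np.concatenate([np.cos(XO), np.sin(XO)], axis=1)

def dgp_layer(X, Omega, W, log_sigma2, log_lengthscale):
    """One hidden layer: random features followed by a linear map to D_F GPs."""
    Phi = rbf_random_features(X, Omega, log_sigma2, log_lengthscale)
    return Phi @ W                                          # (n, D_F)

# Settings quoted in the section: 100 random features, D_F^(l) = 3, m = 200.
n_rf, d_in, d_out = 100, 5, 3
Omega = rng.standard_normal((d_in, n_rf))                   # fixed spectral frequencies (assumption)
W = 0.1 * rng.standard_normal((2 * n_rf, d_out))            # layer weights; treated variationally in the paper

X_batch = rng.standard_normal((200, d_in))                  # one mini-batch of size m = 200
F = dgp_layer(X_batch, Omega, W, log_sigma2=0.0, log_lengthscale=0.0)
print(F.shape)                                              # (200, 3)
```

Stacking such layers and feeding the output F into the next layer's feature map gives the deep architecture; in the quoted setup the kernel parameters Θ would be held fixed for the first 12,000 Adam iterations before all parameters are optimized jointly.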