Approximate Inference Turns Deep Networks into Gaussian Processes
Authors: Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present empirical results where we visualize the feature-map obtained on benchmark datasets such as MNIST and CIFAR, and demonstrate their use for DNN hyperparameter tuning. The code to reproduce our results is available at https://github.com/team-approx-bayes/dnn2gp. |
| Researcher Affiliation | Academia | Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan (emtiyaz.khan@riken.jp); Alexander Immer*, EPFL, Lausanne, Switzerland (alexander.immer@epfl.ch); Ehsan Abedi*, EPFL, Lausanne, Switzerland (ehsan.abedi@epfl.ch); Maciej Korzepa*, Technical University of Denmark, Kgs. Lyngby, Denmark (mjko@dtu.dk) |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. Algorithm descriptions (e.g., RMSprop, VON, VOGN) are given in paragraph text and mathematical equations. |
| Open Source Code | Yes | The code to reproduce our results is available at https://github.com/team-approx-bayes/dnn2gp. |
| Open Datasets | Yes | We present empirical results where we visualize the feature-map obtained on benchmark datasets such as MNIST and CIFAR [...] We consider a version of the Snelson dataset [20] [...] We generate a synthetic regression dataset (N = 100; see Fig. 5) [...] Next, we discuss results for a real dataset: UCI Red Wine Quality (N = 1599) |
| Dataset Splits | No | The paper does not explicitly state specific training/validation/test dataset splits (e.g., percentages or sample counts). While it discusses hyperparameter tuning, which implies a validation set, the details of the splits are not provided. |
| Hardware Specification | No | The paper mentions using the 'RAIDEN computing system' in the acknowledgements but does not provide specific hardware details such as GPU/CPU models, memory, or other specifications used for the experiments. |
| Software Dependencies | No | The paper mentions algorithms like Adam and VOGN but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use a single hidden-layer MLP with 32 units and sigmoidal transfer function. [...] For Laplace, we use Adam [11], and, for VI, we use VOGN [10]. [...] We consider LeNet-5 [12] [...] We fit the data by using a neural network with a single hidden layer of 20 units and tanh nonlinearity. [...] We use an MLP with 2 hidden layers of 20 units each and tanh transfer function. We consider tuning the regularizer δ, the noise-variance σ, and the DNN width. We use the Laplace approximation and tune one parameter at a time while keeping the others fixed (we use respectively σ = 0.64, δ = 30 and σ = 0.64, δ = 3, 1 hidden layer). |
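
The Experiment Setup row quotes several small architectures. The sketch below reconstructs them in PyTorch for illustration only; the input dimensions (2-D toy classification inputs, 1-D toy regression inputs, 11 Red Wine features) are assumptions, and the official implementation lives at https://github.com/team-approx-bayes/dnn2gp.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the architectures quoted in the Experiment Setup row
# (not the authors' code; input dimensions are assumptions for the toy datasets).

# Single-hidden-layer MLP with 32 units and sigmoidal transfer function.
classifier = nn.Sequential(
    nn.Linear(2, 32),   # assumed 2-D inputs for the toy classification data
    nn.Sigmoid(),
    nn.Linear(32, 1),
)

# Regression network: a single hidden layer of 20 tanh units.
regressor = nn.Sequential(
    nn.Linear(1, 20),   # assumed 1-D inputs for the synthetic / Snelson data
    nn.Tanh(),
    nn.Linear(20, 1),
)

# Hyperparameter-tuning network: 2 hidden layers of 20 tanh units each.
tuning_net = nn.Sequential(
    nn.Linear(11, 20),  # UCI Red Wine Quality has 11 input features
    nn.Tanh(),
    nn.Linear(20, 20),
    nn.Tanh(),
    nn.Linear(20, 1),
)

# The quoted setup trains the Laplace runs with Adam.
optimizer = torch.optim.Adam(tuning_net.parameters())
```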
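The same row mentions using a Laplace approximation while tuning δ and σ. The following is a hedged sketch of a generic diagonal Laplace (Gauss-Newton) posterior precision for a scalar-output regression network; the function name, the diagonal structure, and the default values (taken from the quote, σ = 0.64, δ = 3) are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def diagonal_laplace_precision(model, X, sigma=0.64, delta=3.0):
    """Diagonal Laplace (Gauss-Newton) posterior precision for a scalar-output regressor.

    precision_j = delta + (1 / sigma^2) * sum_n (d f(x_n) / d w_j)^2
    A generic sketch of the Laplace step referred to above, not the dnn2gp implementation.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    precision = [delta * torch.ones_like(p) for p in params]
    for x_n in X:
        f_n = model(x_n.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(f_n, params)     # per-example Jacobian of the output
        for prec, g in zip(precision, grads):
            prec.add_(g.pow(2) / sigma ** 2)         # diagonal Gauss-Newton contribution
    return precision

# Toy usage on a one-hidden-layer tanh regression net (stand-in for the N = 100 synthetic data).
net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(), nn.Linear(20, 1))
X = torch.linspace(-3, 3, 100).unsqueeze(1)
posterior_precision = diagonal_laplace_precision(net, X)
```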