Gaussian Process Behaviour in Wide Deep Neural Networks
Authors: Alexander G. de G. Matthews, Jiri Hron, Mark Rowland, Richard E. Turner, Zoubin Ghahramani
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is empirical as well as theoretical: 'We then exhibit situations where existing Bayesian deep networks are close to Gaussian processes in terms of the key quantities of interest. Further, we empirically study the distance between finite networks and their Gaussian process analogues by using maximum mean discrepancy (Gretton et al., 2012) as a distance measure.' |
| Researcher Affiliation | Collaboration | Alexander G. de G. Matthews University of Cambridge am554@cam.ac.uk Jiri Hron University of Cambridge jh2084@cam.ac.uk Mark Rowland University of Cambridge mr504@cam.ac.uk Richard E. Turner University of Cambridge ret26@cam.ac.uk Zoubin Ghahramani University of Cambridge, Uber AI Labs zoubin@eng.cam.ac.uk |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | 'Code for the experiments in the paper can be found at https://github.com/widedeepnetworks/widedeepnetworks' |
| Open Datasets | No | The paper mentions using '10 standard normal input points in 4 dimensions' and 'a simple one dimensional problem and a two dimensional real valued embedding of the four data point XOR problem'. However, it does not provide specific access information (link, DOI, repository, or formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper mentions using 'train and test points' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In this experiment and all those that follow we take weight variance parameter Ĉ_w^(µ) = 0.8 and bias variance C_b = 0.2. For the MCMC we used Hamiltonian Monte Carlo (HMC) (Neal, 2010) updates interleaved with elliptical slice sampling (Murray et al., 2010). We use rectified linear units and correct the variances to avoid a loss of prior variance as depth is increased. |
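The setup and distance-measure rows above can be illustrated with a minimal NumPy sketch (not the authors' code from the linked repository): draw outputs from random ReLU networks whose weight variances are scaled by fan-in, using the stated values 0.8 and 0.2, and compare two output samples with an unbiased RBF-kernel MMD estimate in the spirit of Gretton et al. (2012). The function names, the 1/fan-in scaling as a stand-in for the paper's exact depth correction, and the kernel bandwidth are illustrative assumptions.

```python
import numpy as np

def sample_prior_outputs(x, widths, sigma_w2=0.8, sigma_b2=0.2,
                         n_samples=200, seed=None):
    """Sample outputs of random fully connected ReLU networks at inputs x.

    Weight variance is divided by fan-in (a stand-in for the paper's
    variance correction); biases use the fixed variance sigma_b2.
    Returns an array of shape (n_samples, n_points).
    """
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(n_samples):
        h = x  # (n_points, d_in)
        d_in = h.shape[1]
        for width in widths:
            W = rng.normal(0.0, np.sqrt(sigma_w2 / d_in), size=(d_in, width))
            b = rng.normal(0.0, np.sqrt(sigma_b2), size=width)
            h = np.maximum(h @ W + b, 0.0)  # ReLU hidden layer
            d_in = width
        W = rng.normal(0.0, np.sqrt(sigma_w2 / d_in), size=(d_in, 1))
        b = rng.normal(0.0, np.sqrt(sigma_b2))
        outs.append((h @ W + b).ravel())  # linear readout
    return np.array(outs)

def mmd2_unbiased(X, Y, gamma=0.1):
    """Unbiased estimate of squared MMD with an RBF kernel exp(-gamma * ||a-b||^2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    m, n = len(X), len(Y)
    np.fill_diagonal(Kxx, 0.0)  # drop diagonal terms for unbiasedness
    np.fill_diagonal(Kyy, 0.0)
    return Kxx.sum() / (m * (m - 1)) + Kyy.sum() / (n * (n - 1)) - 2 * Kxy.mean()
```

For two independent samples from the same wide-network prior, the estimate should be close to zero (it can be slightly negative, since the unbiased statistic is not constrained to be non-negative); the paper's experiments instead compare finite networks against their Gaussian process analogues as width grows.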