Gaussian Process Behaviour in Wide Deep Neural Networks

Authors: Alexander G. de G. Matthews, Jiri Hron, Mark Rowland, Richard E. Turner, Zoubin Ghahramani

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate convergence rates empirically, we use maximum mean discrepancy. We then exhibit situations where existing Bayesian deep networks are close to Gaussian processes in terms of the key quantities of interest. Further, we empirically study the distance between finite networks and their Gaussian process analogues by using maximum mean discrepancy (Gretton et al., 2012) as a distance measure."
Researcher Affiliation | Collaboration | Alexander G. de G. Matthews (University of Cambridge, am554@cam.ac.uk); Jiri Hron (University of Cambridge, jh2084@cam.ac.uk); Mark Rowland (University of Cambridge, mr504@cam.ac.uk); Richard E. Turner (University of Cambridge, ret26@cam.ac.uk); Zoubin Ghahramani (University of Cambridge and Uber AI Labs, zoubin@eng.cam.ac.uk)
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | "Code for the experiments in the paper can be found at https://github.com/widedeepnetworks/widedeepnetworks"
Open Datasets | No | The paper mentions using "10 standard normal input points in 4 dimensions" and "a simple one dimensional problem and a two dimensional real valued embedding of the four data point XOR problem". However, it does not provide access information (link, DOI, repository, or formal citation with authors/year) for a publicly available dataset.
Dataset Splits | No | The paper mentions using "train and test points" but does not provide specific split information (exact percentages, sample counts, or a detailed splitting methodology) for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "In this experiment and all those that follow we take weight variance parameters Ĉ_w^(µ) = 0.8 and bias variance C_b = 0.2. For the MCMC we used Hamiltonian Monte Carlo (HMC) (Neal, 2010) updates interleaved with elliptical slice sampling (Murray et al., 2010). We use rectified linear units and correct the variances to avoid a loss of prior variance as depth is increased."
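The distance measure the paper relies on, maximum mean discrepancy (Gretton et al., 2012), can be estimated from two sets of samples alone, which is what makes it suitable for comparing draws from a finite network's prior with draws from its Gaussian process analogue. Below is a minimal sketch of the standard unbiased MMD² estimator with an RBF kernel; the kernel choice, lengthscale, and sample sizes here are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel matrix between the rows of a and b.
    sq_dists = (np.sum(a**2, axis=1)[:, None]
                + np.sum(b**2, axis=1)[None, :]
                - 2.0 * a @ b.T)
    return np.exp(-sq_dists / (2.0 * lengthscale**2))

def mmd2_unbiased(x, y, lengthscale=1.0):
    # Unbiased estimator of squared MMD between the distributions
    # generating samples x and y (Gretton et al., 2012).
    kxx = rbf_kernel(x, x, lengthscale)
    kyy = rbf_kernel(y, y, lengthscale)
    kxy = rbf_kernel(x, y, lengthscale)
    m, n = len(x), len(y)
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))  # off-diagonal mean
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()
```

Samples from the same distribution give an estimate near zero, while samples from clearly different distributions give a positive value, which is the property the paper's convergence experiments exploit.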
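The experiment-setup row quotes the prior used for the finite networks: Gaussian weights and biases with fixed variance parameters, ReLU activations, and variances "corrected" so the prior does not collapse with depth. A common reading of that correction is fan-in scaling of the weight variance together with a factor of 2 for ReLU; the sketch below samples network outputs under that assumption. The function name, the exact scaling, and the width/depth values are hypothetical illustrations, not the paper's implementation.

```python
import numpy as np

def sample_prior_outputs(x, depth=3, width=200, cw=0.8, cb=0.2,
                         n_samples=200, seed=0):
    # Draw samples of a fully connected ReLU network's scalar output under
    # its prior, for a single input x. Weight variance is divided by fan-in,
    # with an extra factor of 2 in hidden layers to compensate for ReLU's
    # halving of variance -- an assumed reading of "correct the variances".
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_samples):
        h, fan_in = x, x.shape[-1]
        for _ in range(depth):
            w = rng.normal(0.0, np.sqrt(2.0 * cw / fan_in), size=(fan_in, width))
            b = rng.normal(0.0, np.sqrt(cb), size=width)
            h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layer
            fan_in = width
        w = rng.normal(0.0, np.sqrt(cw / fan_in), size=(fan_in, 1))
        b = rng.normal(0.0, np.sqrt(cb))
        outputs.append((h @ w + b).item())  # linear readout
    return np.array(outputs)
```

As width grows, the empirical distribution of these output samples should approach the corresponding Gaussian process marginal, which is the convergence the paper measures with MMD.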