Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Wide Neural Networks with Bottlenecks are Deep Gaussian Processes
Authors: Devanshu Agrawal, Theodore Papamarkou, Jacob Hinkle
JMLR 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of bottleneck NNGPs and their link to no-bottleneck NNGPs empirically, showing that restricting a hidden layer of an NNGP to a bottleneck can boost its model likelihood on three example datasets. We also characterize the effect of a bottleneck layer theoretically by analyzing an example multi-output single-bottleneck NNGP with rectified linear unit (ReLU) activation. We investigate this question on a simulated dataset that we call Rings and on two publicly available datasets: Fisher's Iris dataset (Anderson, 1935; Fisher, 1936) and the US Census Boston housing prices dataset (Harrison and Rubinfeld, 1978). |
| Researcher Affiliation | Academia | Devanshu Agrawal, EMAIL, The Bredesen Center, University of Tennessee, Knoxville, TN 37996-3394, USA; Theodore Papamarkou, EMAIL, Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830-8050, USA; Jacob Hinkle, EMAIL, Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830-8050, USA |
| Pseudocode | No | The paper primarily presents mathematical derivations and proofs, along with experimental results and theoretical analysis. It does not include any explicitly labeled pseudocode or algorithm blocks describing a method or procedure in a structured format. |
| Open Source Code | Yes | Code for our simulations and experiments is available at https://code.ornl.gov/d0a/bottleneck_nngp. |
| Open Datasets | Yes | We investigate this question on a simulated dataset that we call Rings and on two publicly available datasets: Fisher's Iris dataset (Anderson, 1935; Fisher, 1936) and the US Census Boston housing prices dataset (Harrison and Rubinfeld, 1978). |
| Dataset Splits | No | The paper describes the creation and labeling of the Rings dataset and mentions using the Iris and Boston House-Prices datasets, but it does not specify any training, validation, or test split percentages or sample counts for any of these datasets. For example, it does not state '80/10/10 split' or '40,000 training samples'. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as exact GPU or CPU models, processor types, or memory specifications. Generic terms like 'on a GPU' are not present either. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and normalized ReLU activation, but it does not specify the versions of any programming languages, libraries, or frameworks (e.g., Python version, PyTorch version, TensorFlow version) that would be needed to reproduce the experiments. |
| Experiment Setup | Yes | We found the optimal variance hyperparameters iteratively through gradient descent. During the forward pass through the network in each iteration, we estimated the integral in Eq. (41) by drawing 100 IID Monte Carlo (MC) samples... We used the Adam optimizer (Kingma and Ba, 2014) to take advantage of the gradient noise generated by MC sampling during optimization; we set the initial learning rate to 0.1. ...if the new MLL was less than the value obtained from the initial forward pass of the iteration, then we multiplied the learning rate by 0.9. ...once complete, we evaluated Eq. (10) once more, this time with 1000 MC samples, to obtain the final MLL estimate for each network architecture. |
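The optimization loop quoted in the Experiment Setup row (Adam ascent on an MC-estimated marginal log-likelihood, with the learning rate decayed by 0.9 whenever the MLL estimate drops, and a final 1000-sample evaluation) can be sketched as below. This is a toy illustration, not the paper's code: the objective `mc_mll_and_grad`, the starting point, and all variable names are hypothetical stand-ins for the NNGP variance hyperparameters and the integral in Eq. (41).

```python
import math
import random

random.seed(0)

def mc_mll_and_grad(theta, n_samples=100):
    """Hypothetical stand-in for the MC-estimated marginal log-likelihood:
    a smooth objective peaked at theta = 0, with sampling noise that
    shrinks as 1/sqrt(n_samples). Returns (value, gradient)."""
    noise = random.gauss(0.0, 1.0) / math.sqrt(n_samples)
    value = -(theta ** 2) + 0.1 * noise
    grad = -2.0 * theta + 0.1 * noise  # MC noise leaks into the gradient
    return value, grad

def fit_hyperparameter(theta=2.0, lr=0.1, steps=200,
                       b1=0.9, b2=0.999, eps=1e-8):
    """Adam ascent on the noisy MLL estimate, decaying lr by 0.9
    whenever the new MLL estimate is below the previous one."""
    m = v = 0.0
    prev, _ = mc_mll_and_grad(theta)
    for t in range(1, steps + 1):
        _, g = mc_mll_and_grad(theta)          # 100-sample MC gradient
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        mhat = m / (1 - b1 ** t)               # Adam bias correction
        vhat = v / (1 - b2 ** t)
        theta += lr * mhat / (math.sqrt(vhat) + eps)  # ascent step
        cur, _ = mc_mll_and_grad(theta)
        if cur < prev:     # MLL estimate dropped: decay the learning rate,
            lr *= 0.9      # mirroring the schedule described in the paper
        prev = cur
    # Final evaluation with 10x more MC samples for a lower-variance estimate
    final_mll, _ = mc_mll_and_grad(theta, n_samples=1000)
    return theta, final_mll
```

The lr-decay-on-regression schedule is a simple guard against the MC noise that Adam is otherwise exploiting: as the iterate nears the optimum, noisy dips become more frequent and the step size shrinks automatically.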