Finite Versus Infinite Neural Networks: an Empirical Study

Authors: Jaehoon Lee, Samuel Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform a careful, thorough, and large scale empirical study of the correspondence between wide neural networks and kernel methods.
Researcher Affiliation | Industry | Google Brain {jaehlee, schsam, jpennin, adlam, xlc, romann, jaschasd}@google.com
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that "All experiments use the Neural Tangents library [15]" and provides a URL for this library: "https://github.com/google/neural-tangents". However, this is a third-party library used by the authors, not the specific source code for the methodology or implementation described in this paper. (A hedged usage sketch of this library appears after the table.)
Open Datasets | Yes | we evaluated every intervention for every architecture and focused on a single dataset, CIFAR-10 [70]. However, to ensure robustness of our results across datasets, we evaluate several key claims on CIFAR-100 and Fashion-MNIST [71].
Dataset Splits | No | Figures 3 and 7 show "Validation MSE" and "Validation Accuracy" plots, indicating that a validation set was used. However, the paper does not explicitly provide specific split percentages, sample counts, or citations to predefined validation splits.
Hardware Specification | No | Typically this takes around 1200 GPU hours with double precision. This indicates the type of hardware (GPU) and usage, but does not provide specific model numbers or detailed specifications.
Software Dependencies | No | All experiments use the Neural Tangents library [15], built on top of JAX [69]. We acknowledge the Python community [127] for developing the core set of tools that enabled this work, including NumPy [128], SciPy [129], Matplotlib [130], Pandas [131], Jupyter [132], JAX [133], Neural Tangents [15], Apache Beam [68], TensorFlow Datasets [134] and Google Colaboratory [135]. While software is listed, no specific version numbers are provided for any of the dependencies.
Experiment Setup | Yes | We use MSE loss... In all cases we use ReLU nonlinearities with critical initialization with small bias variance (σ_w² = 2.0, σ_b² = 0.01). Except if otherwise stated, we consider FCNs with 3-layers of width 2048 and CNNs with 8-layers of 512 channels per layer. In the finite-width settings, the base case uses mini-batch gradient descent at a constant small learning rate. (A hedged sketch of this base configuration also appears after the table.)
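
As context for the Open Source Code row, here is a minimal, hedged sketch of how the cited Neural Tangents library is typically used for infinite-width NNGP/NTK inference under MSE loss. This is not the authors' experiment code: the layer widths and initialization follow the Experiment Setup row, while the placeholder data shapes and the diagonal regularizer are illustrative assumptions.

```python
import jax
import jax.numpy as jnp
import neural_tangents as nt
from neural_tangents import stax

# Infinite-width analogue of the base FCN from the Experiment Setup row:
# 3 ReLU layers of width 2048 with critical initialization and small bias
# variance (sigma_w^2 = 2.0, sigma_b^2 = 0.01). The width argument does not
# affect kernel_fn; it is kept only to mirror the finite-width description.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(2048, W_std=2.0 ** 0.5, b_std=0.01 ** 0.5), stax.Relu(),
    stax.Dense(2048, W_std=2.0 ** 0.5, b_std=0.01 ** 0.5), stax.Relu(),
    stax.Dense(10, W_std=2.0 ** 0.5, b_std=0.01 ** 0.5),
)

# Toy stand-ins for CIFAR-10-shaped data (flattened 32x32x3 images, 10 classes).
x_train = jax.random.normal(jax.random.PRNGKey(0), (8, 3072))
y_train = jax.nn.one_hot(jnp.arange(8) % 10, 10)
x_test = jax.random.normal(jax.random.PRNGKey(1), (4, 3072))

# Closed-form NNGP / NTK inference under MSE loss, trained to convergence (t=None).
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)
nngp_mean, ntk_mean = predict_fn(x_test=x_test, get=('nngp', 'ntk'))
```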
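
For the Experiment Setup row, the following is a minimal JAX sketch, under stated assumptions, of the finite-width base case: a width-2048 ReLU FCN with critical initialization (σ_w² = 2.0, σ_b² = 0.01), MSE loss, and constant-learning-rate mini-batch gradient descent. Reading "3-layers" as three hidden layers, and the learning-rate and batch-handling values, are assumptions of this sketch rather than details confirmed by the paper.

```python
import jax
import jax.numpy as jnp

def init_params(key, sizes=(3072, 2048, 2048, 2048, 10), w_var=2.0, b_var=0.01):
    """Critical initialization: W ~ N(0, w_var / fan_in), b ~ N(0, b_var)."""
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, w_key, b_key = jax.random.split(key, 3)
        w = jax.random.normal(w_key, (d_in, d_out)) * jnp.sqrt(w_var / d_in)
        b = jax.random.normal(b_key, (d_out,)) * jnp.sqrt(b_var)
        params.append((w, b))
    return params

def forward(params, x):
    """Width-2048 ReLU FCN; the final layer is a linear readout to 10 classes."""
    for w, b in params[:-1]:
        x = jax.nn.relu(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def mse_loss(params, x, y):
    return 0.5 * jnp.mean(jnp.sum((forward(params, x) - y) ** 2, axis=-1))

@jax.jit
def sgd_step(params, x, y, lr=1e-3):  # constant small learning rate; 1e-3 is an assumed value
    grads = jax.grad(mse_loss)(params, x, y)
    return [(w - lr * gw, b - lr * gb) for (w, b), (gw, gb) in zip(params, grads)]

# Usage: params = init_params(jax.random.PRNGKey(0)); then repeatedly call
# sgd_step on mini-batches of flattened CIFAR-10 images and one-hot targets.
```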