The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization

Authors: Mufan Li, Mihai Nica, Dan Roy

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To provide a better approximation, we study ReLU ResNets in the infinite-depth-and-width limit, where both depth and width tend to infinity as their ratio, d/n, remains constant. In contrast to the Gaussian infinite-width limit, we show theoretically that the network exhibits log-Gaussian behaviour at initialization in the infinite-depth-and-width limit, with parameters depending on the ratio d/n. Using Monte Carlo simulations, we demonstrate that even basic properties of standard ResNet architectures are poorly captured by the Gaussian limit, but remarkably well captured by our log-Gaussian limit. Based on Monte Carlo simulations, we find excellent agreement between our predictions and finite networks (see Figure 1). (A Monte Carlo sketch of this kind of check appears after the table.)
Researcher Affiliation | Academia | Mufan (Bill) Li (University of Toronto, Vector Institute); Mihai Nica (University of Guelph, Vector Institute); Daniel M. Roy (University of Toronto, Vector Institute). Correspondence: mufan.li@mail.utoronto.ca; nicam@uoguelph.ca; daniel.roy@utoronto.ca.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks; it uses mathematical equations and descriptions of network architectures.
Open Source Code | No | The paper does not provide any explicit statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | No | The paper conducts Monte Carlo simulations to verify theoretical predictions about neural networks at initialization rather than training models on datasets, and it does not mention using or providing access to any publicly available dataset.
Dataset Splits | No | The paper focuses on theoretical limits and Monte Carlo simulations of network properties at initialization, not on training deep learning models on datasets, so no training, validation, or test splits are mentioned.
Hardware Specification | No | The paper does not specify the hardware used for its Monte Carlo simulations (e.g., GPU/CPU models, memory, or cloud resources).
Software Dependencies | No | The paper cites software tools such as JAX, PyTorch, and NumPy in its references, some with version numbers in the bibliographic entries, but it does not state which of these were used as dependencies for its experiments, or with which versions, in the main text or in a dedicated description of the software environment.
Experiment Setup | No | The paper analyzes networks at initialization via Monte Carlo simulations rather than training deep learning models. It defines network parameters (e.g., "All networks have n = 100, nin = nout = 10, α = λ = 1/2"; see the second sketch below), but it does not provide training hyperparameters (learning rate, batch size, epochs), optimizers, or other system-level training configurations typically found in papers describing trained models.
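
As a concrete illustration of the Monte Carlo check described in the Research Type row, the sketch below samples many randomly initialized ReLU residual networks at a fixed depth-to-width ratio d/n and inspects whether log ||output||² looks approximately Gaussian. The residual block used here (x ← x + W ReLU(x)/√n with i.i.d. standard Gaussian weights) is a simplified stand-in rather than the paper's exact architecture, and the function and parameter names are illustrative only.

```python
# Hedged Monte Carlo sketch: this is NOT the paper's exact architecture or code,
# only a simplified stand-in used to illustrate the log-Gaussian check.
import numpy as np

def sample_log_sq_norm(depth, width, n_in=10, rng=None):
    """One forward pass of a randomly initialized residual net; returns log ||x_depth||^2."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.ones(n_in) / np.sqrt(n_in)                              # fixed unit-norm input
    x = rng.standard_normal((width, n_in)) @ x / np.sqrt(n_in)     # input projection to width n
    for _ in range(depth):
        W = rng.standard_normal((width, width))
        x = x + W @ np.maximum(x, 0.0) / np.sqrt(width)            # assumed residual ReLU block
    return float(np.log(np.sum(x ** 2)))

rng = np.random.default_rng(0)
depth, width, n_samples = 50, 100, 500                             # fixed ratio d/n = 0.5
samples = np.array([sample_log_sq_norm(depth, width, rng=rng) for _ in range(n_samples)])

# Under a log-Gaussian limit, the log squared norms should look roughly Gaussian,
# with mean and variance depending on the ratio d/n; near-zero skewness is a quick check.
mean, std = samples.mean(), samples.std()
skew = float(np.mean(((samples - mean) / std) ** 3))
print(f"log ||x||^2: mean={mean:.3f}, std={std:.3f}, skewness={skew:.3f}")
```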
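
The Experiment Setup row quotes the settings n = 100, nin = nout = 10, α = λ = 1/2. The sketch below shows one plausible way those settings could enter a single forward pass at initialization (input projection, α/λ-weighted residual blocks, output projection). The placement of α and λ in the block, and the He-style √(2/n) factor, are assumptions for illustration, not the paper's definition.

```python
# Hedged sketch of the quoted settings at initialization only (no training).
# The residual block below is an assumed parameterization, not the paper's definition.
import numpy as np

def init_and_forward(depth, n=100, n_in=10, n_out=10, alpha=0.5, lam=0.5, seed=0):
    """One forward pass at initialization with the quoted (illustrative) settings."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_in)                                  # random input of size n_in
    x = rng.standard_normal((n, n_in)) @ x / np.sqrt(n_in)         # input projection to width n
    for _ in range(depth):
        W = rng.standard_normal((n, n))
        # Assumed block: with lam + alpha = 1 the squared norm is roughly preserved in
        # expectation (ReLU keeps about half the variance, hence the sqrt(2/n) factor).
        x = np.sqrt(lam) * x + np.sqrt(alpha) * (W @ np.maximum(x, 0.0)) * np.sqrt(2.0 / n)
    return rng.standard_normal((n_out, n)) @ x / np.sqrt(n)        # output projection to n_out

out = init_and_forward(depth=50)
print("output dimension:", out.shape[0], " squared output norm:", float(out @ out))
```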