reproducibilityindex.ai

Width and Depth Limits Commute in Residual Networks

Authors: Soufiane Hayou, Greg Yang

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive simulations that show an excellent match with our theoretical findings.
Researcher Affiliation	Collaboration	(1) Department of Mathematics, National University of Singapore; (2) Microsoft Research AI.
Pseudocode	No	The paper describes mathematical derivations and theoretical concepts, but it does not include any pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described in this paper.
Open Datasets	No	The paper conducts simulations on randomly generated inputs for theoretical validation, rather than using a publicly available or open dataset.
Dataset Splits	No	The paper conducts simulations to validate theoretical findings, but it does not specify dataset splits for training, validation, or testing, as it does not use a pre-existing dataset.
Hardware Specification	No	The paper focuses on theoretical analysis and simulations, but it does not specify any hardware details (e.g., GPU/CPU models, memory) used for conducting these experiments.
Software Dependencies	No	The paper mentions 'PDE solver (RK45 method, Fehlberg, 1968)' as an approximation method for theoretical prediction but does not specify any software libraries or their version numbers used for the simulations.
Experiment Setup	Yes	To empirically validate this finding, we show in Fig. 2 the histograms of the first neuron in the last layer (t = 1 in Theorem 1) for a randomly chosen input a and n, L ∈ {5, 50, 500}. We also perform a Kolmogorov-Smirnov normality test and report the statistic (KS) and the p-value. As can be seen in Fig. 2, the histograms appear to fit the theoretical Gaussian distribution more closely as width and depth increase. The histogram is based on N = 10^4 simulations. In Fig. 5, we compare the empirical covariance q̂_t with the theoretical prediction q_t for (n, L) ∈ {5, 50, 500, 5000}. The average is calculated based on N = 100 simulations. The theoretical prediction q_t is approximated with a PDE solver (RK45 method, Fehlberg, 1968) for t ∈ [0, 1] with a discretization step Δt = 1e-4.
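
The histogram and normality check described in the Experiment Setup row can be sketched roughly as follows. Since the paper releases no code, this is an illustration under stated assumptions, not the authors' implementation. The residual update rule, the weight scaling, the input dimension d = 4, and the small sizes (n = L = 8, N = 300) are all assumptions chosen so that the sketch runs quickly in pure Python. It also measures the Kolmogorov-Smirnov distance against a Gaussian whose standard deviation is estimated from the samples, whereas the quoted setup compares against the theoretical Gaussian, and it reports only the statistic, not a p-value.

```python
import math
import random
from statistics import NormalDist


def simulate_first_neuron(n, L, a, rng):
    """Draw the first neuron of the last layer of one random ReLU ResNet.

    ASSUMED model (not given in the table above):
      Y_0 = W_in a,   Y_l = Y_{l-1} + L^{-1/2} W_l relu(Y_{l-1}),
    with i.i.d. Gaussian entries W_in ~ N(0, 1/d) and W_l ~ N(0, 1/n).
    """
    d = len(a)
    in_std = 1.0 / math.sqrt(d)
    y = [sum(rng.gauss(0.0, in_std) * a_j for a_j in a) for _ in range(n)]
    depth_scale = 1.0 / math.sqrt(L)
    w_std = 1.0 / math.sqrt(n)
    for _ in range(L):
        h = [max(v, 0.0) for v in y]  # ReLU of the previous layer
        # Each output coordinate uses a fresh row of the weight matrix W_l.
        y = [y_i + depth_scale * sum(rng.gauss(0.0, w_std) * h_j for h_j in h)
             for y_i in y]
    return y[0]


def ks_statistic(samples, sigma):
    """One-sample Kolmogorov-Smirnov distance between samples and N(0, sigma^2)."""
    dist = NormalDist(0.0, sigma)
    xs = sorted(samples)
    count = len(xs)
    d_max = 0.0
    for i, x in enumerate(xs):
        c = dist.cdf(x)
        # The empirical CDF jumps from i/count to (i+1)/count at x.
        d_max = max(d_max, (i + 1) / count - c, c - i / count)
    return d_max


rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(4)]  # hypothetical random input, d = 4
samples = [simulate_first_neuron(8, 8, a, rng) for _ in range(300)]

# Stand-in for the theoretical standard deviation: estimated from the samples.
sigma_hat = math.sqrt(sum(s * s for s in samples) / len(samples))
print("KS statistic:", ks_statistic(samples, sigma_hat))
```

To approach the quoted setup, one would raise n and L towards 500 and N towards 10^4 (which in practice calls for a vectorized array library rather than pure Python) and replace `sigma_hat` with the standard deviation predicted by the theory.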