Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features

Authors: Greg Yang, Michael Santacroce, Edward J. Hu

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate it on CIFAR10 and Omniglot against NTK as well as finite networks, finding the π-limit outperforms finite-width models trained normally (without projection) in both settings, closing the performance gap between finite- and infinite-width neural networks previously left by NTK. |
| Researcher Affiliation | Industry | Greg Yang (Microsoft), Michael Santacroce (Microsoft), Edward J. Hu (Microsoft) |
| Pseudocode | No | The paper includes mathematical theorems and descriptions of processes, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Code for this work is available at github.com/santacml/pilim. |
| Open Datasets | Yes | Here we compare the performance of the relu π-limit on CIFAR10 (Krizhevsky, 2009) and Omniglot (Lake et al., 2015)... |
| Dataset Splits | Yes | In each epoch, we validate on 500 batches from the validation set. |
| Hardware Specification | Yes | All of our experiments are done on V100 GPUs. |
| Software Dependencies | No | The paper mentions 'relu activation' and 'half precision' but does not specify software dependencies such as libraries or frameworks with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | We adopt a step learning rate schedule, with a learning rate drop of 0.15 at a certain milestone, which is a hyperparameter. We sweep over a variety of hyperparameters such as the learning rate, gradient clipping, weight decay, the LR drop milestone, etc., as well as width, r, and depth. |
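
The Experiment Setup row above describes a step learning rate schedule (a multiplicative drop of 0.15 at a milestone) together with a sweep over learning rate, gradient clipping, weight decay, and architecture settings. Below is a minimal sketch of such a setup, assuming PyTorch (the paper does not name its framework, per the Software Dependencies row); the toy model, synthetic data, and all hyperparameter values other than the 0.15 drop factor are illustrative placeholders rather than the authors' actual configuration.

```python
# A minimal sketch of the training setup described in the "Experiment Setup" row:
# a step learning-rate schedule that multiplies the LR by 0.15 at a chosen milestone,
# with weight decay and gradient clipping as swept hyperparameters.
# Assumptions: PyTorch, a toy MLP standing in for the pi-limit model, synthetic
# CIFAR10-shaped data, and placeholder hyperparameter values.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils import clip_grad_norm_
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for CIFAR10 batches (3x32x32 images, 10 classes).
data = TensorDataset(torch.randn(512, 3, 32, 32), torch.randint(0, 10, (512,)))
train_loader = DataLoader(data, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)  # swept values
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30], gamma=0.15  # 0.15 LR drop; the milestone is itself a hyperparameter
)

for epoch in range(40):  # placeholder number of epochs
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping (swept)
        optimizer.step()
    scheduler.step()  # apply the step LR schedule once per epoch
```

MultiStepLR with a single milestone matches the quoted "LR drop at a certain milestone"; in an actual sweep the milestone, clipping norm, weight decay, width, r, and depth would each be varied as the paper describes.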