The Early Phase of Neural Network Training
Authors: Jonathan Frankle, David J. Schwab, Ari S. Morcos
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive measurements of the network state during these early iterations of training and leverage the framework of Frankle et al. (2019) to quantitatively probe the weight distribution and its reliance on various aspects of the dataset. |
| Researcher Affiliation | Collaboration | Jonathan Frankle (MIT CSAIL); David J. Schwab (CUNY ITS, Facebook AI Research); Ari S. Morcos (Facebook AI Research) |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | Throughout this paper, we study five standard convolutional neural networks for CIFAR-10. |
| Dataset Splits | No | The paper mentions CIFAR-10 and reports evaluation accuracy, but it does not explicitly state training/validation/test split percentages, absolute sample counts, or references to predefined splits that would support reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | All networks follow the same training regime: we train with SGD for 160 epochs starting at learning rate 0.1 (momentum 0.9) and drop the learning rate by a factor of ten at epoch 80 and again at epoch 120. Training includes weight decay with weight 1e-4. Data is augmented with normalization, random flips, and random crops up to four pixels in any direction. Batch size: 128 (ResNet, WRN), 64 (VGG-13). (See the sketch after this table.) |
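
The reported setup maps onto a standard PyTorch training loop. The sketch below is not the authors' code (the paper releases none); the CIFAR-10 normalization statistics and the `resnet18` stand-in model are assumptions, flagged in the comments.

```python
# Hypothetical sketch of the training regime quoted above; not the authors' code.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

# Augmentation as described: normalization, random flips, and random crops up
# to four pixels in any direction. The mean/std values are the commonly used
# CIFAR-10 statistics (an assumption; the paper does not state them).
CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)
CIFAR10_STD = (0.2470, 0.2435, 0.2616)
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=train_transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True, num_workers=4)  # 64 for VGG-13

# Stand-in model: the paper studies five CIFAR-10 CNNs; any of them slots in here.
model = torchvision.models.resnet18(num_classes=10)

criterion = nn.CrossEntropyLoss()
# SGD, learning rate 0.1, momentum 0.9, weight decay 1e-4.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=1e-4)
# Drop the learning rate by a factor of ten at epochs 80 and 120.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[80, 120], gamma=0.1)

for epoch in range(160):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```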