What training reveals about neural network complexity

Authors: Andreas Loukas, Marinos Poiitis, Stefanie Jegelka

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our findings in the context of two tasks: Task 1. Regression of a sinusoidal function with increasing frequency... Task 2. CIFAR classification under label corruption... In agreement with previous studies [4, 6, 3, 2], Figure 2 shows that training slows down as the complexity of the fitted function increases. Figures 2b and 2e depict the per-epoch bias trajectory...
Researcher Affiliation | Academia | Andreas Loukas (EPFL, andreas.loukas@epfl.ch); Marinos Poiitis (Aristotle University of Thessaloniki, mpoiitis@csd.auth.gr); Stefanie Jegelka (MIT, stefje@mit.edu)
Pseudocode | No | No structured pseudocode or algorithm blocks are present in the paper.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology.
Open Datasets | Yes | Task 2. CIFAR classification under label corruption. In our second experiment, we trained a convolutional neural network (CNN) to classify 10000 images from the dog and airplane classes of CIFAR10 [74]. (A hedged data-loading sketch for this subset appears after the table.)
Dataset Splits | No | The paper mentions training on '100 randomly generated training points' for Task 1 and '10000 images from the dog and airplane classes of CIFAR10' for Task 2, but does not provide specific train/validation/test split information (percentages, counts, or references to predefined splits).
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers needed to replicate the experiments.
Experiment Setup | Yes | We trained an MLP with 5 layers consisting entirely of ReLU activations and with the 1st layer weights being identity. We repeated the experiment 10 times, each time training the network with SGD using a learning rate of 0.001 and an MSE loss until it had fit the sinusoidal function at 100 randomly generated training points. We set the first layer identically with the regression experiment. We repeated the experiment 8 times, each time training the network with SGD using a BCE loss and a learning rate of 0.0025. Let f(t) be a depth-d NN with ReLU activations being trained with SGD, a BCE loss and 1/2-Dropout. (A hedged sketch of the Task 1 training setup appears directly after the table.)
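
As one concrete reading of the Task 1 description quoted in the Experiment Setup row, the sketch below builds a 5-layer ReLU MLP whose first layer is frozen to the identity and trains it with SGD (learning rate 0.001) and an MSE loss on 100 random points of a sinusoid. The hidden width, the target frequency, the tiling of the scalar input to match a square first layer, and the stopping tolerance are assumptions, not details taken from the paper.

```python
# Hedged sketch of the quoted Task 1 setup: 5-layer ReLU MLP, frozen identity first
# layer, SGD with lr = 0.001, MSE loss, 100 random training points of a sinusoid.
# Width, frequency, input tiling, and the stopping tolerance are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
width, freq = 64, 3.0                                   # assumed hidden width and frequency

# 100 randomly generated training points of a sinusoidal target.
x = torch.rand(100, 1) * 2 * torch.pi
y = torch.sin(freq * x)

layers = [nn.Linear(width, width), nn.ReLU()]           # 1st layer: frozen to the identity below
for _ in range(3):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers += [nn.Linear(width, 1)]                         # 5 linear layers in total
model = nn.Sequential(*layers)

with torch.no_grad():
    model[0].weight.copy_(torch.eye(width))             # identity first-layer weights
    model[0].bias.zero_()
for p in model[0].parameters():
    p.requires_grad_(False)

optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.001
)
loss_fn = nn.MSELoss()

x_tiled = x.repeat(1, width)                            # tile the scalar input to the layer width
for epoch in range(100_000):
    optimizer.zero_grad()
    loss = loss_fn(model(x_tiled), y)
    loss.backward()
    optimizer.step()
    if loss.item() < 1e-3:                              # stop once the sinusoid is (roughly) fit
        break
```

Repeating this loop over 10 seeds, as the quoted setup describes, would give the per-run trajectories the paper aggregates.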
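
For the Open Datasets row, a minimal sketch of how the two-class CIFAR-10 subset could be assembled with torchvision follows. The standard CIFAR-10 class indices (airplane = 0, dog = 5) yield exactly 10,000 training images; the transform, batch size, and 0/1 relabelling to match the BCE loss mentioned for Task 2 are assumptions, and the CNN itself and the label-corruption procedure are omitted.

```python
# Hedged sketch: assembling the airplane-vs-dog CIFAR-10 subset described in the
# Open Datasets row. Class indices follow the standard CIFAR-10 labelling
# (airplane = 0, dog = 5); transform, batch size, and 0/1 relabelling are assumptions.
import torch
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

transform = transforms.ToTensor()
cifar = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)

# Keep only airplane (0) and dog (5); 5,000 images per class gives 10,000 in total.
keep = {0: 0.0, 5: 1.0}
indices = [i for i, y in enumerate(cifar.targets) if y in keep]
subset = Subset(cifar, indices)

def collate(batch):
    """Stack images and relabel targets to {0, 1} for a binary (BCE) objective."""
    images, targets = zip(*batch)
    x = torch.stack(images)
    y = torch.tensor([keep[int(t)] for t in targets])
    return x, y

loader = DataLoader(subset, batch_size=128, shuffle=True, collate_fn=collate)
```

The label-corruption step and the CNN from Task 2 would be layered on top of this loader; they are not reproduced here.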