Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?

Authors: Samet Oymak, Mahdi Soltanolkotabi

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify our theoretical claims, we conducted experiments on MNIST classification and low-rank matrix regression. To illustrate the tradeoffs between the loss function and the distance to the initial point, we define normalized misfit and normalized distance as follows. (Section 5, Numerical Experiments) The definitions themselves are sketched below the table.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of California, Riverside; (2) Department of Electrical and Computer Engineering, University of Southern California.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We consider MNIST digit classification task and use a standard LeNet model (LeCun et al., 1998) from TensorFlow (Abadi et al., 2016).
Dataset Splits | No | The paper mentions 'training' and 'test errors' in the context of the MNIST experiments, but does not provide specific percentages or sample counts for training, validation, or test splits. For the synthetic low-rank regression, it only mentions varying the sample size 'n'.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions using 'TensorFlow' and 'Adam' but does not specify version numbers or any other software dependencies with specific versions.
Experiment Setup | Yes | Both experiments use Adam with learning rate 0.001 and batch size 100 for 1000 iterations. At each iteration, we record the normalized misfit and distance to obtain a misfit-distance trajectory similar to Figure 1. (A training-loop sketch based on this setup follows the table.)
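The sentence quoted in the Research Type row ends just before the definitions themselves. One natural reading, consistent with a trajectory that starts near one at initialization, is sketched below; the exact normalization used in Section 5 of the paper is an assumption here, not a verbatim reproduction.

```latex
% Hypothetical definitions (assumed, not quoted from the paper):
% normalized misfit and normalized distance at iteration \tau,
% measured relative to the initial point \theta_0.
\[
  \text{normalized misfit}(\tau)
    = \frac{\lVert f(\theta_\tau) - y \rVert_{\ell_2}}
           {\lVert f(\theta_0) - y \rVert_{\ell_2}},
  \qquad
  \text{normalized distance}(\tau)
    = \frac{\lVert \theta_\tau - \theta_0 \rVert_{\ell_2}}
           {\lVert \theta_0 \rVert_{\ell_2}}.
\]
```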
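The Experiment Setup row gives enough detail (Adam, learning rate 0.001, batch size 100, 1000 iterations, misfit and distance recorded at each iteration) to sketch the recording loop. The sketch below is a hedged reconstruction, not the authors' code: the particular LeNet variant, the least-squares form of the misfit, and the normalization by the initial misfit and by the norm of the initialization are all hypothetical choices made here for illustration.

```python
# Minimal sketch of the misfit-distance trajectory recording, assuming a
# least-squares reading of the misfit (network outputs vs. one-hot labels)
# and a LeNet-style Keras model. Loss form and normalizations are assumptions.
import numpy as np
import tensorflow as tf

# --- Data: MNIST, rescaled to [0, 1], labels one-hot encoded ----------------
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., np.newaxis].astype("float32") / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10).astype("float32")

# --- Model: a LeNet-style convolutional network (hypothetical variant) ------
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Conv2D(16, 5, activation="relu"),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation="relu"),
    tf.keras.layers.Dense(84, activation="relu"),
    tf.keras.layers.Dense(10),
])

theta0 = [w.numpy().copy() for w in model.trainable_variables]  # initial point
theta0_norm = np.sqrt(sum(np.sum(w ** 2) for w in theta0))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)       # as reported
batch_size, num_iters = 100, 1000                               # as reported

# Reference misfit at initialization, used to normalize (an assumption).
init_residual = model(x_train[:batch_size]) - y_train[:batch_size]
init_misfit = float(tf.norm(init_residual))

trajectory = []  # list of (normalized misfit, normalized distance) pairs
for it in range(num_iters):
    idx = np.random.choice(len(x_train), batch_size, replace=False)
    xb, yb = x_train[idx], y_train[idx]

    with tf.GradientTape() as tape:
        residual = model(xb, training=True) - yb     # f(theta) - y
        loss = 0.5 * tf.reduce_sum(residual ** 2)    # least-squares objective
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # Record the misfit-distance pair for this iteration.
    misfit = float(tf.norm(model(xb) - yb)) / init_misfit
    dist = np.sqrt(sum(np.sum((w.numpy() - w0) ** 2)
                       for w, w0 in zip(model.trainable_variables, theta0)))
    trajectory.append((misfit, dist / theta0_norm))
```

Plotting `trajectory` with normalized distance on the horizontal axis and normalized misfit on the vertical axis would yield the kind of misfit-distance curve the row describes.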