Understanding the Dynamics of Gradient Flow in Overparameterized Linear Models
Authors: Salma Tarmoun, Guilherme Franca, Benjamin D Haeffele, Rene Vidal
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6. Numerical Experiments: Here we provide numerical evidence to our theoretical results. First, we generate a random matrix Y with Y_ij ~ N(0, 1) and set m = 5, n = 10 and k = 50. We approximate the dynamics of gradient flow for one-layer and two-layer linear models by using gradient descent with a step size η = 10^-3 (smaller step sizes did not lead to a discernible change). We evaluate the reconstruction error ‖Y - X(t)‖_F / ‖Y‖_F, where X(t) = U(t)V^T(t), and compare the evolution of the singular values of X(t). (A minimal numerical sketch of this setup is given after the table.) |
| Researcher Affiliation | Academia | 1Mathematical Institute for Data Science, Johns Hopkins University, 2Department of Applied Mathematics and Statistics, Johns Hopkins University, 3Computer Science Division, University of California, Berkeley, 4Department of Biomedical Engineering, Johns Hopkins University. |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | No statement regarding the release or availability of open-source code for the described methodology was found. |
| Open Datasets | No | The paper uses generated synthetic data, not a publicly accessible dataset with concrete access information. 'First, we generate a random matrix Y with Y_ij ~ N(0, 1)' and 'We generated the matrices W and X with entries drawn from N(0, 1) and Y = φ(XW) + ε where ε ~ 10^-3 · N(0, I).' |
| Dataset Splits | No | The paper uses generated synthetic data and does not explicitly mention or provide details for training, validation, or test dataset splits. It only states 'We train the two networks'. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU models, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper does not mention specific software dependencies with version numbers (e.g., programming languages or libraries with their versions). |
| Experiment Setup | Yes | We approximate the dynamics of gradient flow for one-layer and two-layer linear models by using gradient descent with a step size η = 10^-3 (smaller step sizes did not lead to a discernible change). We consider Gaussian initializations, i.e., U_0 and V_0 have entries ~ N(0, σ^2), where σ is varied to obtain different degrees of imbalance. ... We set η = 10^-5, Y ~ N(0, 1), m = 5, n = 10 and vary k. ... Initial weights are drawn from a normal distribution N(0, 10^-1). (See the initialization sketch after the table.) |
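
Since no code is released for this paper, the following is a minimal NumPy sketch (not the authors' implementation) of the quoted setup: gradient descent with step size η = 10^-3 on a two-layer factorization X = U V^T of a random Gaussian target Y, tracking the relative reconstruction error ‖Y - X(t)‖_F / ‖Y‖_F and the singular values of X(t). The iteration count, random seed, and the particular σ shown are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the authors' code): approximate gradient flow for the
# two-layer linear model X(t) = U(t) V(t)^T by running gradient descent on
# L(U, V) = 0.5 * ||Y - U V^T||_F^2 with the quoted sizes and step size.
rng = np.random.default_rng(0)           # seed is an illustrative choice

m, n, k = 5, 10, 50                      # Y is m x n; hidden width k (overparameterized)
eta = 1e-3                               # step size quoted in the paper
sigma = 1e-1                             # initialization scale (varied in the paper)

Y = rng.standard_normal((m, n))          # Y_ij ~ N(0, 1)
U = sigma * rng.standard_normal((m, k))  # U_0 entries ~ N(0, sigma^2)
V = sigma * rng.standard_normal((n, k))  # V_0 entries ~ N(0, sigma^2)

rel_errors, sing_vals = [], []
for _ in range(20000):                   # iteration count is an assumption
    X = U @ V.T
    R = X - Y                            # residual U V^T - Y
    # Explicit Euler step on the gradient-flow ODEs:
    #   dU/dt = -(U V^T - Y) V,   dV/dt = -(U V^T - Y)^T U
    U, V = U - eta * R @ V, V - eta * R.T @ U
    rel_errors.append(np.linalg.norm(R, "fro") / np.linalg.norm(Y, "fro"))
    sing_vals.append(np.linalg.svd(X, compute_uv=False))
```

Plotting `rel_errors` and the columns of `sing_vals` over iterations reproduces the kind of curves described in the quote; the one-layer baseline corresponds to running gradient descent directly on X with the same loss.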
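
The quoted setup varies the initialization scale σ to obtain different degrees of imbalance between the two layers. The sketch below illustrates that dependence under the assumption that imbalance is measured by ‖U_0^T U_0 - V_0^T V_0‖_F, a common choice for two-layer linear models; the quoted text does not specify the metric, and the σ values shown are examples, not values from the paper.

```python
import numpy as np

# Sketch (assumption-labeled): how the Gaussian initialization scale sigma
# controls the imbalance ||U0^T U0 - V0^T V0||_F between the two layers.
rng = np.random.default_rng(0)
m, n, k = 5, 10, 50

for sigma in (1e-3, 1e-1, 1.0):                # example scales, not from the paper
    U0 = sigma * rng.standard_normal((m, k))   # entries ~ N(0, sigma^2)
    V0 = sigma * rng.standard_normal((n, k))
    imbalance = np.linalg.norm(U0.T @ U0 - V0.T @ V0, "fro")
    print(f"sigma = {sigma:g}  ->  imbalance ~ {imbalance:.3f}")
```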