Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Dynamics of Deep Matrix Factorization Beyond the Edge of Stability
Authors: Avrajit Ghosh, Soo Min Kwon, Rongrong Wang, Saiprasad Ravishankar, Qing Qu
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments to support our theory, along with examples demonstrating how these phenomena occur in nonlinear networks and how they differ from those which have benign landscape such as in DLNs. (...) 4 EXPERIMENTAL RESULTS |
| Researcher Affiliation | Academia | Avrajit Ghosh¹, Soo Min Kwon², Rongrong Wang¹, Saiprasad Ravishankar¹, Qing Qu² — ¹Michigan State University, ²University of Michigan. Equal contribution; Correspondence to EMAIL; EMAIL |
| Pseudocode | No | The paper describes the methodology and analyses using mathematical equations and textual explanations, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | For the regression task, we minimize the loss L(Θ) = ‖G(Θ) − y_image‖₂², where G(Θ) is a UNet parameterized by Θ, and y_image denotes one of the images in Figure 10b. (...) we train a 2-layer fully connected neural network on N labeled training images from the CIFAR-10 dataset using MSE loss and plot the sharpness in Figure 10c. (...) on a 5K subset of the MNIST dataset, following Cohen et al. (2021). (...) on a subsampled 20K set on MNIST and CIFAR-10. |
| Dataset Splits | No | The paper mentions using a '5K subset of the MNIST dataset' and 'a subsampled 20K set on MNIST and CIFAR-10' and 'N labeled training images from the CIFAR-10 dataset', but it does not provide specific training/test/validation splits (e.g., percentages or exact sample counts for each split). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers like SGD and architectures like UNet and MLP, but it does not provide specific ancillary software details, such as library names with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | Here, η > 2/β corresponds to the EOS regime... For the DLN, we consider a 3-layer network... with an initialization scale of α = 0.01. (...) we train a 2-layer fully connected neural network on N labeled training images from the CIFAR-10 dataset using MSE loss and plot the sharpness in Figure 10c. (...) a 3-layer MLP without bias terms for the weights, with each hidden layer consisting of 1000 units. The network is trained using MSE loss with a learning rate of η = 4, along with random weights scaled by α = 0.01 and full-batch gradient descent on a 5K subset of the MNIST dataset, following Cohen et al. (2021). (...) we consider a 4-layer MLP with ReLU activations and hidden layers of 200 units each for classification on a subsampled 20K set on MNIST and CIFAR-10. |
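For readers assessing how reproducible the Experiment Setup row is from the paper's description alone, the deep linear network (DLN) configuration can be sketched as below. This is not the authors' code: full-batch gradient descent on a 3-layer matrix-factorization loss with the stated initialization scale α = 0.01. The target matrix `M`, dimension `d`, step size `eta`, and step count are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
M = rng.standard_normal((d, d))                 # illustrative target matrix
alpha = 0.01                                    # init scale stated in the paper
Ws = [alpha * rng.standard_normal((d, d)) for _ in range(3)]
eta = 0.01                                      # illustrative step size

def loss(Ws):
    """Squared-error factorization loss 0.5 * ||W3 W2 W1 - M||_F^2."""
    E = Ws[2] @ Ws[1] @ Ws[0] - M
    return 0.5 * np.sum(E ** 2)

loss0 = loss(Ws)
for _ in range(20000):                          # full-batch gradient descent
    W1, W2, W3 = Ws
    E = W3 @ W2 @ W1 - M                        # residual W3 W2 W1 - M
    Ws = [W1 - eta * (W3 @ W2).T @ E,           # dL/dW1 = (W3 W2)^T E
          W2 - eta * W3.T @ E @ W1.T,           # dL/dW2 = W3^T E W1^T
          W3 - eta * E @ (W2 @ W1).T]           # dL/dW3 = E (W2 W1)^T
```

Note that the EOS regime the paper studies corresponds to η > 2/β (β the sharpness); the small step size here keeps the sketch in the stable regime, so it illustrates only the training setup, not the edge-of-stability dynamics themselves.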