Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
The Implicit Regularization of Stochastic Gradient Flow for Least Squares
Authors: Alnur Ali, Edgar Dobriban, Ryan Tibshirani
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 gives numerical examples supporting our theory. We generated the data matrix according to X = Σ^{1/2}W, where the entries of W were i.i.d. following a normal distribution. ... Figure 4 plots the risk of ridge regression, discrete-time SGD (2), and Theorem 2. ... averaged over 30 draws of y (the underlying coefficients were drawn from a normal distribution, and scaled so the signal-to-noise ratio was roughly 1). |
| Researcher Affiliation | Academia | Alnur Ali (Stanford University), Edgar Dobriban (University of Pennsylvania), Ryan J. Tibshirani (Carnegie Mellon University). |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. |
| Open Datasets | No | We generated the data matrix according to X = Σ^{1/2}W, where the entries of W were i.i.d. following a normal distribution. We allow for correlations between the features, setting the diagonal entries of the predictor covariance Σ to 1, and the off-diagonals to 0.5. |
| Dataset Splits | No | The paper does not provide specific dataset split information needed to reproduce the data partitioning. It only mentions the total data size (n=100, p=500). |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details needed to replicate the experiment. |
| Experiment Setup | Yes | We set ϵ = 2.2548e-4, following Lemma 5. ... Below, we present results for n = 100, p = 500, and m = 20. |
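
Because no code or dataset is released, the synthetic setup quoted in the table has to be rebuilt from its description. The following is a minimal sketch of one way to do that in NumPy, not the authors' implementation. Several choices are assumptions that the quoted text does not pin down: W is taken to be n × p standard normal and multiplied on the right by the symmetric square root of Σ, the noise has unit variance, the signal-to-noise ratio is defined as βᵀΣβ divided by the noise variance, and the SGD update uses a (step / m) scaling of the mini-batch gradient. The function names, seeds, and iteration count are hypothetical.

```python
import numpy as np


def make_design(n=100, p=500, rho=0.5, seed=0):
    """Draw an n x p design whose rows have covariance Sigma.

    Sigma has 1 on the diagonal and rho off the diagonal, as in the table.
    """
    rng = np.random.default_rng(seed)
    sigma = np.full((p, p), rho)
    np.fill_diagonal(sigma, 1.0)
    # Symmetric square root of Sigma via its eigendecomposition.
    evals, evecs = np.linalg.eigh(sigma)
    sigma_half = (evecs * np.sqrt(evals)) @ evecs.T
    # Assumption: W is n x p standard normal and multiplies Sigma^{1/2} on the left.
    w = rng.standard_normal((n, p))
    return w @ sigma_half, sigma


def make_response(x, sigma, snr=1.0, seed=1):
    """Draw normal coefficients, rescale to a target SNR, and add unit-variance noise.

    Assumption: SNR is beta' Sigma beta / noise variance, with noise variance 1.
    """
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(x.shape[1])
    beta *= np.sqrt(snr / (beta @ sigma @ beta))
    y = x @ beta + rng.standard_normal(x.shape[0])
    return y, beta


def minibatch_sgd(x, y, step=2.2548e-4, m=20, n_iter=500, seed=2):
    """Generic mini-batch SGD for least squares, started at zero.

    Assumption: each step averages the gradient over m rows sampled without
    replacement; n_iter is an arbitrary illustrative value.
    """
    rng = np.random.default_rng(seed)
    n, p = x.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        idx = rng.choice(n, size=m, replace=False)
        resid = y[idx] - x[idx] @ beta
        beta += (step / m) * (x[idx].T @ resid)
    return beta


if __name__ == "__main__":
    x, sigma = make_design()
    y, beta_true = make_response(x, sigma)
    beta_hat = minibatch_sgd(x, y)
    print("training MSE:", np.mean((y - x @ beta_hat) ** 2))
```

The defaults mirror the values reported under Experiment Setup (n = 100, p = 500, m = 20, ϵ = 2.2548e-4). Reproducing Figure 4 would additionally require averaging over 30 draws of y and the paper's risk definition, neither of which is sketched here.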