The Benefits of Implicit Regularization from SGD in Least Squares Problems
Authors: Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean P. Foster, Sham Kakade
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments: We perform experiments on Gaussian least squares problems. The empirical observations are consistent with our theoretical findings and again demonstrate the benefit of the implicit regularization of SGD. |
| Researcher Affiliation | Collaboration | Difan Zou (University of California, Los Angeles; knowzou@cs.ucla.edu), Jingfeng Wu (Johns Hopkins University; uuujf@jhu.edu), Vladimir Braverman (Johns Hopkins University; vova@cs.jhu.edu), Quanquan Gu (University of California, Los Angeles; qgu@cs.ucla.edu), Dean P. Foster (Amazon; dean@foster.net), Sham M. Kakade (University of Washington & Microsoft Research; sham@cs.washington.edu) |
| Pseudocode | No | The information is insufficient. The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The information is insufficient. The reproducibility checklist explicitly states 'N/A' for including code needed to reproduce the results, and no other statement or link for open-source code is provided. |
| Open Datasets | No | The information is insufficient. The paper describes generating synthetic data based on specified distributions and parameters (e.g., 'Gaussian least squares problems', 'one-hot data distribution') but does not refer to a pre-existing, publicly available dataset with concrete access information (link, DOI, formal citation). |
| Dataset Splits | No | The information is insufficient. The paper does not specify training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits). |
| Hardware Specification | No | The information is insufficient. The reproducibility checklist states 'N/A' for hardware details, and the paper does not specify any particular GPU or CPU models, memory, or other hardware used for experiments. |
| Software Dependencies | No | The information is insufficient. The paper does not provide specific software names with version numbers for reproducibility, such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | We perform experiments on Gaussian least squares problems. ... The problem dimension is d = 200 and the variance of the model noise is σ² = 1. We consider 6 problem instances, which are the combinations of 2 different covariance matrices H: λ_i = i^{-1} and λ_i = i^{-2}; and 3 different true model parameter vectors w*: w*[i] = 1, w*[i] = i^{-1}, and w*[i] = i^{-10}. ... where the hyperparameters (i.e., γ and λ) are fine-tuned to achieve the best performance. |
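
To make the reported setup concrete, below is a minimal sketch of one of the six problem instances described above (Gaussian data with covariance eigenvalues λ_i = i^{-1} and true parameter w*[i] = i^{-1}), comparing one-pass SGD with tail averaging against ridge regression. Since no code is released, this is not the authors' implementation; the sample size `n`, step size `gamma`, and ridge penalty `lam` are illustrative placeholders, whereas the paper fine-tunes these hyperparameters for best performance.

```python
# Sketch of the paper's Gaussian least squares setup: d = 200, noise
# variance sigma^2 = 1, H = diag(i^{-1}), w*[i] = i^{-1}. Hyperparameters
# here are assumptions for illustration, not the paper's tuned values.
import numpy as np

rng = np.random.default_rng(0)

d, n, sigma = 200, 4000, 1.0
lam_spectrum = 1.0 / np.arange(1, d + 1)   # eigenvalues lambda_i = i^{-1}
w_star = 1.0 / np.arange(1, d + 1)         # true parameter w*[i] = i^{-1}

# Sample x ~ N(0, H) with H = diag(lambda_1, ..., lambda_d).
X = rng.standard_normal((n, d)) * np.sqrt(lam_spectrum)
y = X @ w_star + sigma * rng.standard_normal(n)

def sgd_tail_average(X, y, gamma):
    """One-pass SGD, returning the average of the last half of the iterates."""
    n, d = X.shape
    w = np.zeros(d)
    avg = np.zeros(d)
    for t in range(n):
        x_t, y_t = X[t], y[t]
        w -= gamma * (x_t @ w - y_t) * x_t   # stochastic gradient step
        if t >= n // 2:                      # accumulate tail iterates
            avg += w
    return avg / (n - n // 2)

def ridge(X, y, lam):
    """Closed-form ridge regression estimator."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def excess_risk(w_hat):
    """Population excess risk (w_hat - w*)^T H (w_hat - w*) for diagonal H."""
    return float(np.sum(lam_spectrum * (w_hat - w_star) ** 2))

print("SGD excess risk:  ", excess_risk(sgd_tail_average(X, y, gamma=0.1)))
print("Ridge excess risk:", excess_risk(ridge(X, y, lam=1.0)))
```

With γ and λ swept over a grid (as the paper's fine-tuning implies), comparing the two excess risks reproduces the kind of SGD-versus-ridge comparison the experiments report; the specific values printed by this sketch depend on the placeholder hyperparameters and random seed.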