Implicit Sparse Regularization: The Impact of Depth and Early Stopping
Authors: Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a series of simulation experiments to further illuminate our theoretical findings. Our simulation setup is described as follows. The entries of X are sampled as i.i.d. Rademacher random variables and the entries of the noise vector are i.i.d. N(0, σ²) random variables. We let w⋆ = γ1_S. The values for the simulation parameters are: n = 500, p = 3000, k = 5, γ = 1, σ = 0.5 unless otherwise specified. For ℓ2-error plots, each simulation is repeated 30 times, and the median ℓ2 error is depicted. The shaded area indicates the region between the 25th and 75th percentiles pointwise. (A data-generation sketch is given after this table.) |
| Researcher Affiliation | Academia | Jiangyuan Li (jiangyuanli@tamu.edu, Texas A&M University); Thanh V. Nguyen (thanhng.cs@gmail.com); Chinmay Hegde (chinmay.h@nyu.edu, New York University); Raymond K. W. Wong (raywong@tamu.edu, Texas A&M University) |
| Pseudocode | No | The paper describes the gradient descent update rule in mathematical form (equation 2) but does not present it as a formal pseudocode block or algorithm. |
| Open Source Code | Yes | The code is available on https://github.com/jiangyuan2li/Implicit-Sparse-Regularization. |
| Open Datasets | Yes | The entries of X are sampled as i.i.d. Rademacher random variables and the entries of the noise vector are i.i.d. N(0, σ²) random variables. We let w⋆ = γ1_S. ... and we defer the result on MNIST to Appendix E. |
| Dataset Splits | No | The paper provides simulation parameters (n, p, k, γ, σ) and mentions that 'each simulation is repeated 30 times,' but it does not specify explicit training, validation, or test splits for the simulated data or for the MNIST dataset mentioned. |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., CPU, GPU models, memory, or cloud instances). |
| Software Dependencies | No | The paper mentions that "The code is available on https://github.com/jiangyuan2li/Implicit-Sparse-Regularization." However, it does not specify any software dependencies with version numbers in the text (e.g., Python version, specific libraries like PyTorch or TensorFlow versions). |
| Experiment Setup | Yes | The values for the simulation parameters are: n = 500, p = 3000, k = 5, γ = 1, σ = 0.5 unless otherwise specified. For ℓ2-error plots, each simulation is repeated 30 times, and the median ℓ2 error is depicted. The shaded area indicates the region between the 25th and 75th percentiles pointwise. ... We choose different values of the depth N to illustrate the convergence of the algorithm. ... We intentionally pick a relatively large initialization α = 2 × 10⁻³ where the algorithm fails to converge for N = 2. With the same initialization, the recovery manifests as N increases (Figure 3). ... Note that for both Figures 1 and 4, we set n = 100 and p = 200. Since α^N would decrease quickly with N, which would cause the algorithm to take a large number of iterations to escape from the small region, we fix α^N = 10⁻⁵ instead of fixing α for Figure 4. ... The initialization is α = 10⁻⁴ and the step size is η = 10⁻³ for all N. (A gradient-descent sketch is given after this table.) |
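The simulation setup quoted in the Research Type and Open Datasets rows is fully specified by the text: a Rademacher design, Gaussian noise, and a k-sparse target w⋆ = γ1_S. The sketch below is a minimal illustration of that setup, assuming a uniformly random support S and NumPy's default generator; the function name, seed handling, and support sampling are our own choices, not taken from the authors' repository.

```python
import numpy as np

def make_sparse_regression(n=500, p=3000, k=5, gamma=1.0, sigma=0.5, seed=0):
    """Simulated data matching the paper's description (defaults from the table above)."""
    rng = np.random.default_rng(seed)
    # Design matrix with i.i.d. Rademacher (+1/-1) entries.
    X = rng.choice([-1.0, 1.0], size=(n, p))
    # k-sparse ground truth: gamma on a support S of size k, zero elsewhere.
    # (The paper does not state how S is chosen; a uniformly random support is assumed here.)
    w_star = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    w_star[support] = gamma
    # Observations y = X w_star + xi with i.i.d. N(0, sigma^2) noise.
    y = X @ w_star + rng.normal(0.0, sigma, size=n)
    return X, y, w_star, support
```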
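The Pseudocode and Experiment Setup rows refer to the paper's gradient descent update (its equation 2) on the depth-N parametrization w = u^N − v^N (elementwise powers) with initialization u₀ = v₀ = α·1. The sketch below shows one way to write that update; the 1/(2n) loss scaling, the fixed iteration budget, and the function name are assumptions made for illustration, not the authors' implementation.

```python
def depth_n_gd(X, y, N=3, alpha=1e-4, eta=1e-3, n_iters=10_000):
    """Gradient descent on the depth-N factorization w = u**N - v**N.

    alpha: initialization scale (u_0 = v_0 = alpha * 1); eta: step size.
    The iteration budget n_iters is an assumed stand-in for the paper's
    early-stopping time, which is chosen theoretically/empirically there.
    """
    n, p = X.shape
    u = np.full(p, alpha)
    v = np.full(p, alpha)
    for _ in range(n_iters):
        w = u**N - v**N
        grad_w = X.T @ (X @ w - y) / n   # gradient of (1/(2n)) * ||y - Xw||^2 w.r.t. w
        # Chain rule through the elementwise power parametrization.
        u = u - eta * N * u**(N - 1) * grad_w
        v = v + eta * N * v**(N - 1) * grad_w
    return u**N - v**N


# Example usage with the simulated data above (illustrative only):
# X, y, w_star, support = make_sparse_regression()
# w_hat = depth_n_gd(X, y, N=3)
# print(np.linalg.norm(w_hat - w_star) / np.linalg.norm(w_star))
```

Stopping this loop at an intermediate iteration, rather than at convergence, is the point of the paper: with a small initialization α and larger depth N, the iterates pass near the sparse target before gradient descent begins to fit the noise, which is why early stopping matters.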