Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks
Authors: Gauthier Gidel, Francis Bach, Simon Lacoste-Julien
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experiments; 4.1 Assump. 1 for Classification Datasets; 4.2 Linear Autoencoder. In Fig. 2, we plot the trace norm of $W^{(t)}$ and $W_1^{(t)} W_2^{(t)}$ as well as their respective reconstruction errors as a function of $t$, the number of iterations. Table 1: Value of the quantities xy and x defined in (27). |
| Researcher Affiliation | Academia | Gauthier Gidel: Mila & DIRO, Université de Montréal. Francis Bach: INRIA & École Normale Supérieure, PSL Research University, Paris. Simon Lacoste-Julien: Mila & DIRO, Université de Montréal. |
| Pseudocode | No | No section or figure explicitly labeled 'Pseudocode' or 'Algorithm' was found. |
| Open Source Code | No | The paper does not provide any statement or link regarding the release of source code for the described methodology. |
| Open Datasets | Yes | MNIST [LeCun et al., 2010], CIFAR-10 [Krizhevsky et al., 2014], and ImageNet [Deng et al., 2009] |
| Dataset Splits | No | The paper mentions datasets used for experiments but does not provide specific details on training, validation, or test splits (e.g., percentages, sample counts, or cross-validation setup). |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as exact GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions). |
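| Experiment Setup | Yes | In this experiment, we have $p = d = 20$, $n = 1000$, $r = 5$ and we generated synthetic data. First we generate a fixed matrix $B \in \mathbb{R}^{d \times r}$ such that $B_{kl} \sim U([0, 1])$, $1 \le k, l \le n$. Then, for $1 \le i \le n$, we sample $x_i \sim B z_i + \epsilon_i$ where $z_i \sim N(0, D := \mathrm{diag}(4, 2, 1, 1/2, 1/4))$ and $\epsilon_i \sim 10^{-3} N(0, I_d)$. If $\eta < \frac{1}{2\sigma_1}$, $\eta < 2\,\frac{\sigma_i - \sigma_{i+1}}{\sigma_i^2}$ and $\eta < \frac{\sigma_i - \sigma_{i+1}}{\sigma_{i+1}^2}$, for $1 \le i \le r_{xy} - 1$. We initialize with $W_1^{(0)} = U \,\mathrm{diag}(e^{-\delta_1}, \ldots, e^{-\delta_p})\, Q$ and $W_2^{(0)} = Q^{-1} \mathrm{diag}(e^{-\delta_1}, \ldots, e^{-\delta_d})\, V$ |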
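
The synthetic-data recipe quoted in the Experiment Setup row can be sketched in NumPy as follows. This is a minimal illustration and not the authors' code, since the paper releases none: the function name `make_synthetic_data`, its parameters, and the seed are assumptions introduced here, and $D$ is read as a covariance matrix, so its diagonal entries are treated as variances.

```python
import numpy as np


def make_synthetic_data(n=1000, d=20, latent_variances=(4.0, 2.0, 1.0, 0.5, 0.25),
                        noise_scale=1e-3, seed=0):
    """Sample x_i = B z_i + eps_i as described in the Experiment Setup row.

    Hypothetical helper: name, signature, and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    variances = np.asarray(latent_variances, dtype=float)
    r = variances.shape[0]

    # Fixed mixing matrix B in R^{d x r} with entries drawn from U([0, 1]).
    B = rng.uniform(0.0, 1.0, size=(d, r))

    # Latent codes z_i ~ N(0, D) with D = diag(latent_variances):
    # scale standard normals by the per-coordinate standard deviation.
    Z = rng.standard_normal((n, r)) * np.sqrt(variances)

    # Isotropic noise eps_i ~ noise_scale * N(0, I_d).
    eps = noise_scale * rng.standard_normal((n, d))

    # Stack the samples as rows: X[i] = B @ Z[i] + eps[i].
    X = Z @ B.T + eps
    return X, B


X, B = make_synthetic_data()
singular_values = np.linalg.svd(X, compute_uv=False)
```

Because the noise scale is small relative to the latent variances, the data matrix `X` is approximately of rank $r = 5$, which shows up as a sharp drop after the fifth entry of `singular_values`.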