Training invariances and the low-rank phenomenon: beyond linear networks
Authors: Thien Le, Stefanie Jegelka
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we extend this theoretical result to the last few linear layers of the much wider class of nonlinear ReLU-activated feedforward networks containing fully-connected layers and skip connections. Similar to the linear case, the proof relies on specific local training invariances, sometimes referred to as alignment, which we show to hold for submatrices where neurons are stably-activated in all training examples, and it reflects empirical results in the literature. Our theoretical results offer explanations for empirical observations on more general architectures, and apply to the experiments in (Huh et al., 2021) for ResNet and CNNs. (An illustrative sketch of this low-rank effect follows the table.) |
| Researcher Affiliation | Academia | Thien Le & Stefanie Jegelka, Massachusetts Institute of Technology, {thienle,stefje}@mit.edu |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not mention releasing any open-source code for the described methodology, nor does it provide any links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not describe the use of any specific dataset for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not include details on dataset splits (training, validation, test) for reproducibility, as it does not perform empirical evaluations. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not describe the hardware used for any experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers, as it does not involve empirical experiments. |
| Experiment Setup | No | The paper is theoretical and does not include details on experimental setup, such as hyperparameters or training configurations. |
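
Since the paper ships no code, the low-rank claim in the abstract excerpt above can still be probed empirically. Below is a minimal, hypothetical sketch, not the authors' implementation: it trains a small ReLU network whose last two layers are linear (no activation between them) and prints the singular values of their product, which the paper predicts to have low effective rank. All layer sizes, the synthetic data, and the optimizer settings are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the low-rank phenomenon in the trailing linear layers
# of a ReLU network. Everything here (sizes, data, weight decay) is an
# illustrative assumption, not a setting taken from the paper.
import torch

torch.manual_seed(0)

# Synthetic regression data from a rank-1 teacher, so a low-rank solution exists.
n, d_in, d_out = 512, 20, 10
X = torch.randn(n, d_in)
teacher = torch.randn(d_out, 1) @ torch.randn(1, d_in)  # rank-1 target map
Y = X @ teacher.T

# ReLU network whose last two layers are linear; the paper's result
# concerns the product of such trailing linear layers.
model = torch.nn.Sequential(
    torch.nn.Linear(d_in, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 64),
    torch.nn.Linear(64, d_out),
)

# Small weight decay is an assumption added to encourage rank-minimizing
# behavior during training.
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)
loss_fn = torch.nn.MSELoss()
for _ in range(5000):
    opt.zero_grad()
    loss_fn(model(X), Y).backward()
    opt.step()

# Inspect the spectrum of the product of the last two (linear) layers.
with torch.no_grad():
    W = model[3].weight @ model[2].weight  # shape (d_out, 64)
    s = torch.linalg.svdvals(W)
    print("normalized singular values:",
          [f"{x:.3f}" for x in (s / s[0]).tolist()])
```

Under these assumptions one would expect the normalized spectrum to decay sharply after the first singular value, consistent with the low effective rank that Huh et al. (2021) report for ResNets and CNNs.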