Towards Understanding Learning in Neural Networks with Linear Teachers
Authors: Roei Sarussi, Alon Brutzkus, Amir Globerson
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical results that validate our theoretical analysis. We also provide empirical evaluation that confirms that weight clustering indeed explains why approximate linear decision boundaries are learned. |
| Researcher Affiliation | Academia | The Blavatnik School of Computer Science, Tel Aviv University. Correspondence to: Alon Brutzkus <alonbrutzkus@mail.tau.ac.il>. |
| Pseudocode | No | The paper describes optimization algorithms like SGD and gradient flow conceptually and mathematically, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | A network is trained on Gaussian data and binary MNIST problems. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages or sample counts) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) used in the experiments. |
| Experiment Setup | Yes | The network has 100 neurons, initialized from a Gaussian with standard deviation 0.001 for small initialization and 30 for large initialization. We consider the case where L_S(W) is minimized using SGD in epochs with a batch size of one and a learning rate η. (A hedged sketch of this setup follows the table.) |
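To make the quoted setup concrete, below is a minimal sketch of one plausible reading of it: a two-layer network with 100 hidden neurons, first-layer weights drawn from a Gaussian with the stated standard deviation, and batch-size-one SGD on Gaussian data. Only those details come from the table; the input dimension, sample count, learning rate value, leaky-ReLU activation and slope, fixed ±1 second layer, linear-teacher labels, and hinge loss are all assumptions for illustration, not specifics taken from the paper's quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n, hidden = 50, 1000, 100   # input dim and sample count are assumptions
init_std = 0.001               # "small" init; the quote gives 30 for "large"
eta = 0.01                     # learning rate (value assumed)
alpha = 0.1                    # leaky-ReLU slope (assumed)

# "Gaussian data": points labeled by a hypothetical linear teacher w_star
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

# Two-layer network: trainable first layer W, fixed +/-1 second layer v
W = rng.normal(scale=init_std, size=(hidden, d))
v = np.concatenate([np.ones(hidden // 2), -np.ones(hidden // 2)])

def forward(x):
    """Return network output and first-layer pre-activations."""
    pre = W @ x
    act = np.where(pre > 0, pre, alpha * pre)  # leaky ReLU
    return v @ act, pre

# SGD in epochs with batch size one, per the quoted setup
for epoch in range(5):
    for i in rng.permutation(n):
        out, pre = forward(X[i])
        if y[i] * out < 1:  # hinge-loss margin violation (loss choice assumed)
            grad_act = np.where(pre > 0, 1.0, alpha)
            # Gradient step on W: dL/dW_j = -y * v_j * sigma'(w_j . x) * x
            W += eta * y[i] * (v * grad_act)[:, None] * X[i][None, :]
```

Under the paper's thesis, a run in the small-initialization regime would be the natural place to check whether the learned decision boundary is approximately linear, e.g., by comparing sign(f(x)) against the sign of a single linear predictor fit to the network's outputs.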