Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Towards Understanding Learning in Neural Networks with Linear Teachers
Authors: Roei Sarussi, Alon Brutzkus, Amir Globerson
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide empirical results that validate our theoretical analysis. We also provide empirical evaluation that confirms that weight clustering indeed explains why approximate linear decision boundaries are learned. |
| Researcher Affiliation | Academia | The Blavatnik School of Computer Science, Tel Aviv University. Correspondence to: Alon Brutzkus <EMAIL>. |
| Pseudocode | No | The paper describes optimization algorithms like SGD and gradient flow conceptually and mathematically, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | A network is trained on Gaussian data and binary MNIST problems. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages or sample counts) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) used in the experiments. |
| Experiment Setup | Yes | The network has 100 neurons, initialized from a Gaussian with standard deviation 0.001 for small initialization and 30 for large initialization. We consider the case where L_S(W) is minimized using SGD in epochs with a batch size of one and a learning rate η. |
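
Since no source code is available, the Experiment Setup row can be illustrated with a minimal NumPy sketch. Only the 100 neurons, the Gaussian initialization (standard deviation 0.001 or 30), the Gaussian data, the linear teacher, and SGD with a batch size of one come from the entries above. Everything else is an assumption made for illustration and is not confirmed by this report: the leaky ReLU activation, the fixed second layer `v`, the logistic loss, the learning rate value, the epoch count, the input dimension, and all function names.

```python
import numpy as np


def make_linear_teacher_data(n, d, seed=0):
    """Gaussian inputs labeled by a random linear teacher (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_star = rng.normal(size=d)
    y = np.sign(X @ w_star)  # labels in {-1, +1}
    return X, y


def leaky_relu(z, alpha=0.1):
    return np.where(z > 0, z, alpha * z)


def train_sgd(X, y, n_neurons=100, init_std=0.001, lr=0.05, epochs=30,
              alpha=0.1, seed=0):
    """Train the first layer of a two-layer network with batch-size-one SGD.

    Assumptions (not from the report): leaky ReLU activation, a fixed
    second layer v with half positive and half negative entries, and the
    logistic loss log(1 + exp(-y * f(x))).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Gaussian initialization: std 0.001 (small) or 30 (large) per the report.
    W = rng.normal(0.0, init_std, size=(n_neurons, d))
    half = n_neurons // 2
    v = np.concatenate([np.ones(half), -np.ones(n_neurons - half)])
    v = v / np.sqrt(n_neurons)

    for _ in range(epochs):
        for i in rng.permutation(n):  # one example per SGD step
            x, t = X[i], y[i]
            pre = W @ x
            out = v @ leaky_relu(pre, alpha)
            # d(loss)/d(out) = -t * sigmoid(-t * out), written with tanh
            # so that large |out| does not overflow.
            g_out = -t * 0.5 * (1.0 - np.tanh(0.5 * t * out))
            slope = np.where(pre > 0, 1.0, alpha)
            W -= lr * g_out * np.outer(v * slope, x)
    return W, v


def predict(W, v, X, alpha=0.1):
    return np.sign(leaky_relu(X @ W.T, alpha) @ v)
```

A possible usage is `X, y = make_linear_teacher_data(200, 10)` followed by `W, v = train_sgd(X, y)` and `np.mean(predict(W, v, X) == y)` for training accuracy. Passing `init_std=30` switches to the large-initialization regime mentioned in the table.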