Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
The inductive bias of ReLU networks on orthogonally separable data
Authors: Mary Phuong, Christoph H. Lampert
ICLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first verify that the theoretical result (Theorem 1) is predictive of experimental outcomes, even when some technical assumptions are violated. Second, we present evidence that a similar result may hold for deeper networks as well, although this goes beyond Theorem 1. |
| Researcher Affiliation | Academia | Mary Phuong & Christoph H. Lampert, IST Austria, Am Campus 1, 3400 Klosterneuburg, Austria, EMAIL |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | We experiment on the MNIST dataset subsetted to two classes, the digit 0 and the digit 1. |
| Dataset Splits | No | The paper mentions training on datasets but does not provide specific details on how the data was split into training, validation, and test sets with percentages or sample counts. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and network types (ReLU, residual network) but does not provide specific version numbers for any software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | We train by stochastic gradient descent with batch size 50 and a learning rate of 0.1 for 500 epochs. At initialisation, we multiply all weights by 0.05. |
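
The experiment setup quoted in the table (SGD, batch size 50, learning rate 0.1, 500 epochs, all weights multiplied by 0.05 at initialisation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the use of NumPy, the two-layer ReLU architecture, the logistic loss, the Gaussian base initialisation, the synthetic stand-in data, and all function names (`make_toy_data`, `init_params`, `train_sgd`, and so on) are hypothetical choices made here. Only the four hyperparameter values come from the quoted text.

```python
import numpy as np


def make_toy_data(n=200, d=5, seed=0):
    """Hypothetical stand-in for the two-class data: x = y * |z|.

    Same-class points have positive inner products and opposite-class
    points have negative ones, so the set is orthogonally separable.
    """
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * np.abs(rng.standard_normal((n, d)))
    return X, y


def init_params(d, width=20, scale=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: Gaussian base initialisation with 1/sqrt(fan_in) scaling.
    W = rng.standard_normal((width, d)) / np.sqrt(d)
    a = rng.standard_normal(width) / np.sqrt(width)
    # From the quoted setup: multiply all weights by 0.05 at initialisation.
    return scale * W, scale * a


def forward(W, a, X):
    # Two-layer ReLU network: f(x) = sum_j a_j * relu(w_j . x)
    H = np.maximum(X @ W.T, 0.0)
    return H @ a, H


def logistic_loss(W, a, X, y):
    # Assumption: mean logistic loss log(1 + exp(-y * f(x))).
    out, _ = forward(W, a, X)
    return float(np.mean(np.logaddexp(0.0, -y * out)))


def train_sgd(X, y, W, a, lr=0.1, batch_size=50, epochs=500, seed=0):
    """Plain minibatch SGD; defaults follow the quoted hyperparameters."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    for _ in range(epochs):
        perm = rng.permutation(n)  # reshuffle once per epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            out, H = forward(W, a, xb)
            # Gradient of the mean logistic loss w.r.t. the outputs,
            # written with tanh so that large margins do not overflow.
            g = -yb * 0.5 * (1.0 - np.tanh(0.5 * yb * out)) / len(idx)
            grad_a = H.T @ g
            grad_W = (g[:, None] * (H > 0) * a[None, :]).T @ xb
            W = W - lr * grad_W
            a = a - lr * grad_a
    return W, a


if __name__ == "__main__":
    X, y = make_toy_data()
    W0, a0 = init_params(X.shape[1])
    W, a = train_sgd(X, y, W0, a0, epochs=50)  # fewer epochs for a quick check
    print("loss before:", logistic_loss(W0, a0, X, y))
    print("loss after: ", logistic_loss(W, a, X, y))
```

To mirror the quoted setup more closely, one would replace `make_toy_data` with the MNIST digits 0 and 1 and keep the default `epochs=500`. The summary above does not state the network width or the train/test split, so those would remain free choices.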