The inductive bias of ReLU networks on orthogonally separable data
Authors: Mary Phuong, Christoph H. Lampert
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first verify that the theoretical result (Theorem 1) is predictive of experimental outcomes, even when some technical assumptions are violated. Second, we present evidence that a similar result may hold for deeper networks as well, although this goes beyond Theorem 1. |
| Researcher Affiliation | Academia | Mary Phuong & Christoph H. Lampert, IST Austria, Am Campus 1, 3400 Klosterneuburg, Austria, {bphuong,chl}@ist.ac.at |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any access to source code for the described methodology. |
| Open Datasets | Yes | We experiment on the MNIST dataset subsetted to two classes, the digit 0 and the digit 1. (A loading sketch for this subset follows the table.) |
| Dataset Splits | No | The paper mentions training on datasets but does not specify how the data was split into training, validation, and test sets; no split percentages or sample counts are given. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and network types (ReLU networks, residual networks) but does not list version numbers for any software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | We train by stochastic gradient descent with batch size 50 and a learning rate of 0.1 for 500 epochs. At initialisation, we multiply all weights by 0.05. (A training-loop sketch follows the table.) |
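
To make the reported dataset concrete, here is a minimal loading sketch in Python, assuming torchvision's standard MNIST loader; the authors release no code, so the storage path and the flattening to 784-dimensional vectors are assumptions.

```python
import torch
from torchvision import datasets

# Minimal sketch (not the authors' code): restrict MNIST to the two
# classes used in the paper, digit 0 and digit 1.
mnist = datasets.MNIST(root="data", train=True, download=True)

# Keep only samples labelled 0 or 1, scale pixels to [0, 1],
# and flatten each image to a 784-dimensional vector.
mask = (mnist.targets == 0) | (mnist.targets == 1)
images = mnist.data[mask].float().div(255.0).view(-1, 28 * 28)
labels = mnist.targets[mask].float()
```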
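Continuing from the tensors above, the reported training configuration (SGD, batch size 50, learning rate 0.1, 500 epochs, all weights multiplied by 0.05 at initialisation) could look as follows. The hidden width of 100 and the logistic loss are assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical architecture: a one-hidden-layer ReLU network.
# The width of 100 is an assumption, not a value from the paper.
model = nn.Sequential(nn.Linear(28 * 28, 100), nn.ReLU(), nn.Linear(100, 1))

# "At initialisation, we multiply all weights by 0.05."
with torch.no_grad():
    for param in model.parameters():
        param.mul_(0.05)

# SGD with learning rate 0.1 and batch size 50, as reported.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.BCEWithLogitsLoss()  # logistic loss; an assumption

loader = DataLoader(TensorDataset(images, labels.unsqueeze(1)),
                    batch_size=50, shuffle=True)

# 500 epochs, as reported.
for epoch in range(500):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```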