Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions
Authors: Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein
ICML 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the training error and the decision regions of trained network in Figure 4. The grid size in each case of Figure 4 has been manually chosen so that one can see clearly the connected/disconnected components in the decision regions. First, we observe that for two hidden units (n1 = 2), the network satisfies the condition of Theorem 3.10 and thus can only learn connected regions, which one can also clearly see in the figure, where one basically gets a linear separator. However, for three hidden units (n1 = 3), one can see that the network can produce disconnected decision regions, which shows that both our Theorems 3.10 and 3.11 are tight, in the sense that width d + 1 is already sufficient to produce disconnected components, whereas the results say that for width less than d + 1 the decision regions have to be connected. |
| Researcher Affiliation | Academia | ¹Department of Mathematics and Computer Science, Saarland University, Germany; ²University of Tübingen, Germany. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information for open-source code related to the methodology. |
| Open Datasets | Yes | We use a single image of digit 1 from the MNIST dataset to create a new artificial dataset... In Figure 7, we show another similar experiment on MNIST dataset, but now for all the 10 image classes. |
| Dataset Splits | No | The paper mentions '2000 training images' and discusses training error, but does not specify validation splits or percentages for the datasets used. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running its experiments. |
| Software Dependencies | No | The paper mentions methods like 'leaky ReLU', 'SGD', and 'cross-entropy loss', but does not specify any software packages or libraries with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.0') that were used in their experiments. |
| Experiment Setup | Yes | We then train this network by using SGD with momentum for 1000 epochs and learning rate 0.1 and reduce it by a factor of 2 after every 50 epochs. |
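
The learning-rate schedule quoted in the Experiment Setup row can be sketched as a step decay. This is only an illustration, not the authors' code: the function name `step_decay_lr`, the 0-indexed epoch convention, and the assumption that the first halving takes effect at epoch 50 are all hypothetical. The momentum coefficient and the framework are not stated in the quoted text, so they are left out.

```python
def step_decay_lr(epoch, base_lr=0.1, factor=2.0, step=50):
    """Return the learning rate in effect during `epoch` (0-indexed).

    Defaults follow the quoted setup: initial rate 0.1, divided by 2
    after every 50 epochs. The indexing convention is an assumption.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Integer division counts how many reductions have already happened.
    return base_lr / factor ** (epoch // step)


if __name__ == "__main__":
    # Print the rate at a few epochs of the 1000-epoch run.
    for epoch in (0, 49, 50, 100, 999):
        print(epoch, step_decay_lr(epoch))
```

Under these assumptions the rate is halved 19 times over 1000 epochs, so the final epochs run at 0.1 / 2^19, roughly 1.9e-7.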