Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions

Authors: Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the training error and the decision regions of the trained network in Figure 4. The grid size in each case of Figure 4 has been manually chosen so that one can see clearly the connected/disconnected components in the decision regions. First, we observe that for two hidden units (n1 = 2), the network satisfies the condition of Theorem 3.10 and thus can only learn connected regions, which one can also clearly see in the figure, where one basically gets a linear separator. However, for three hidden units (n1 = 3), one can see that the network can produce disconnected decision regions, which shows that both our Theorems 3.10 and 3.11 are tight, in the sense that width d + 1 is already sufficient to produce disconnected components, whereas the results say that for width less than d + 1 the decision regions have to be connected.
Researcher Affiliation | Academia | 1 Department of Mathematics and Computer Science, Saarland University, Germany; 2 University of Tübingen, Germany.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information for open-source code related to the methodology.
Open Datasets | Yes | We use a single image of digit 1 from the MNIST dataset to create a new artificial dataset... In Figure 7, we show another similar experiment on MNIST dataset, but now for all the 10 image classes.
Dataset Splits | No | The paper mentions '2000 training images' and discusses training error, but does not specify validation splits or percentages for the datasets used.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running its experiments.
Software Dependencies | No | The paper mentions methods like 'leaky ReLU', 'SGD', and 'cross-entropy loss', but does not specify any software packages or libraries with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.0') that were used in their experiments.
Experiment Setup | Yes | We then train this network by using SGD with momentum for 1000 epochs and learning rate 0.1, and reduce it by a factor of 2 after every 50 epochs.
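
To make the width condition quoted under Research Type concrete, here is a minimal sketch of such a one-hidden-layer classifier. It assumes PyTorch and two-dimensional inputs (d = 2); the framework choice, the leaky-ReLU slope, and the two-class output head are illustrative assumptions, since the paper excerpt only names leaky ReLU and the hidden width n1.

```python
# Illustrative sketch (not the authors' code): a one-hidden-layer network on
# 2-D inputs, where the hidden width n1 controls whether disconnected
# decision regions are even possible (Theorems 3.10/3.11 in the paper).
import torch.nn as nn

d = 2    # input dimension of the toy data
n1 = 3   # width d + 1 = 3 can already produce disconnected decision regions;
         # with n1 = 2 (i.e. width < d + 1) the learned regions must be connected.

model = nn.Sequential(
    nn.Linear(d, n1),
    nn.LeakyReLU(negative_slope=0.1),  # slope is an assumption; the paper only says "leaky ReLU"
    nn.Linear(n1, 2),                  # two-class scores fed to a cross-entropy loss
)
```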
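
The row on Experiment Setup likewise translates into a short training loop. The following is a hedged sketch, reusing `model` from the snippet above: the learning rate of 0.1, the 1000 epochs, and the halving of the rate every 50 epochs follow the quoted text, while the momentum value, batch size, and the synthetic stand-in data are assumptions added only so the snippet runs.

```python
# Hedged sketch of the reported training schedule: SGD with momentum,
# learning rate 0.1, 1000 epochs, learning rate halved every 50 epochs.
import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(200, 2)                  # stand-in 2-D points; not the paper's actual dataset
y = (X.norm(dim=1) > 1.0).long()         # stand-in labels for a two-class problem
train_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)   # momentum 0.9 is an assumption
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

for epoch in range(1000):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                      # halve the learning rate every 50 epochs
```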