Optimal Architectures in a Solvable Model of Deep Networks
Authors: Jonathan Kadmon, Haim Sompolinsky
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 2: Overlap dynamics. (A) Trajectories of the overlaps across layers from eqs. (8)-(11) (solid lines) and simulations (circles). The dashed red line shows the predicted separatrix m*. The deviations from the theoretical prediction near the separatrix are due to finite-size effects in the simulations ( = 0.4, f = 0.1). (B) Basins of attraction for two values of f as a function of . Lines show the theoretical prediction and shaded areas the simulations. (C) Convergence time (number of layers) to the m = 1 attractor. Near the unstable fixed point (dashed vertical lines) the convergence time diverges, and it decreases rapidly for larger initial conditions m0 > m* (an illustrative iteration of such overlap dynamics is sketched after the table). In Figure 4, two networks were trained as autoencoders on a set of templates composed of 3-digit numbers (see experimental procedures in the supplementary material). Both networks have the same number of neurons. In the first, all processing neurons are placed in a single wide layer, while in the other they are divided into 10 equally sized layers. As the theory predicts, the deep structure can reproduce the original templates over a wide range of initial noise, while the single layer typically reduces the noise but fails to reproduce the original image (a hedged sketch of this comparison follows the table). |
| Researcher Affiliation | Academia | Jonathan Kadmon, The Racah Institute of Physics and ELSC, The Hebrew University, Israel (jonathan.kadmon@mail.huji.ac.il); Haim Sompolinsky, The Racah Institute of Physics and ELSC, The Hebrew University, Israel, and Center for Brain Science, Harvard University |
| Pseudocode | No | The paper describes the mathematical model and equations (8)-(11) but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide or link to any source code; the experiments are described only in prose and deferred to the supplement: "In figure 4, two networks were trained as autoencoders on a set of templates composed of 3-digit numbers (See experimental procedures in the supplementary material)." |
| Open Datasets | Yes | "In figure 4, two networks were trained as autoencoders on a set of templates composed of 3-digit numbers (See experimental procedures in the supplementary material). Input data was prepared using the MNIST handwritten digit database." MNIST is publicly available (a hedged sketch of one plausible template preparation appears after the table). |
| Dataset Splits | No | The paper states that "Input data was prepared using the MNIST handwritten digit database" but does not report any train/validation/test splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments or simulations. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their respective versions) that would be needed to replicate the experiments. |
| Experiment Setup | No | The main text describes the Figure 4 experiment only at a high level ("two networks were trained as autoencoders on a set of templates composed of 3-digit numbers") and defers the details to the supplementary material; no hyperparameters, training procedure, or random seeds are reported. |
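
The overlap dynamics quoted above (Figure 2) can be illustrated with a toy layer-to-layer map. The sketch below does not reproduce the paper's equations (8)-(11), which are not quoted in this report; it uses a stand-in cubic update with a stable attractor at m = 1 and a hypothetical unstable fixed point m* acting as a separatrix, to show why the convergence time (in layers) diverges for initial overlaps just above m*.

```python
# Stand-in layer-to-layer overlap map -- NOT eqs. (8)-(11) from the paper.
# It has fixed points at m = 0 (stable), m = M_STAR (unstable separatrix),
# and m = 1 (stable attractor), mimicking the qualitative picture of Figure 2.
M_STAR = 0.3  # hypothetical separatrix location
RATE = 1.5    # hypothetical gain controlling convergence speed

def overlap_update(m):
    """One layer of propagation for the overlap m."""
    return m + RATE * m * (m - M_STAR) * (1.0 - m)

def layers_to_converge(m0, tol=1e-3, max_layers=10_000):
    """Number of layers until the overlap reaches the m = 1 attractor."""
    m = m0
    for layer in range(1, max_layers + 1):
        m = overlap_update(m)
        if abs(1.0 - m) < tol:
            return layer
    return float("inf")  # never converged: m0 was below the separatrix

# Convergence time diverges as the initial overlap approaches m* from above
# and decreases rapidly for larger m0, as in Figure 2C.
for m0 in (0.29, 0.301, 0.31, 0.4, 0.6, 0.9):
    print(f"m0 = {m0:5.3f} -> {layers_to_converge(m0)} layers")
```

Any update rule with this fixed-point structure shows the same divergence near the separatrix; the cubic form is chosen only for transparency.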
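
The exact construction of the "3-digit number" templates is in the (unreported) supplementary material, so the following preparation is an assumption: three randomly chosen MNIST digits concatenated side by side into a 28x84 binary image, with noisy inputs generated by independent pixel flips. The tensorflow.keras MNIST loader is used for convenience; the binarization threshold and flip-noise model are hypothetical.

```python
import numpy as np
from tensorflow.keras.datasets import mnist  # any MNIST loader would do

rng = np.random.default_rng(0)
(x_train, _), _ = mnist.load_data()  # 60000 grayscale 28x28 digit images

def make_template(rng):
    """Assumed template: three random MNIST digits concatenated into a 28x84 image."""
    idx = rng.integers(0, len(x_train), size=3)
    img = np.concatenate([x_train[i] for i in idx], axis=1)
    return (img > 127).astype(np.float32)  # binarize (threshold is an assumption)

def corrupt(images, flip_prob, rng):
    """Assumed noise model: flip each binary pixel independently with prob flip_prob."""
    flips = rng.random(images.shape) < flip_prob
    return np.where(flips, 1.0 - images, images)

templates = np.stack([make_template(rng) for _ in range(100)])
noisy = corrupt(templates, flip_prob=0.2, rng=rng)
print(templates.shape, noisy.shape)  # (100, 28, 84) for both
```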
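
Finally, the Figure 4 comparison (same neuron budget, one wide layer versus 10 equal layers) can be sketched as a pair of denoising autoencoders. Everything below beyond the layer split is assumed: the total neuron count, activations, loss, and optimizer are not given in the main text.

```python
import tensorflow as tf

INPUT_DIM = 28 * 84   # flattened 3-digit template (see the previous sketch)
TOTAL_NEURONS = 1000  # hypothetical shared budget of processing neurons

def build_autoencoder(n_layers):
    """Same total neuron count, split evenly across n_layers hidden layers."""
    width = TOTAL_NEURONS // n_layers
    model = tf.keras.Sequential([tf.keras.Input(shape=(INPUT_DIM,))])
    for _ in range(n_layers):
        model.add(tf.keras.layers.Dense(width, activation="relu"))
    model.add(tf.keras.layers.Dense(INPUT_DIM, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

wide = build_autoencoder(n_layers=1)   # all neurons in a single wide layer
deep = build_autoencoder(n_layers=10)  # neurons divided into 10 equal layers

# Train both networks to map noisy inputs back to their clean templates,
# using arrays prepared as in the previous sketch:
#   X_noisy = noisy.reshape(-1, INPUT_DIM)
#   X_clean = templates.reshape(-1, INPUT_DIM)
#   wide.fit(X_noisy, X_clean, epochs=50, verbose=0)
#   deep.fit(X_noisy, X_clean, epochs=50, verbose=0)
```

Under the paper's theory, the deep split should recover the templates over a wider range of initial noise than the single wide layer, which tends only to reduce the noise.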