Deep Information Propagation
Authors: Samuel S. Schoenholz, Justin Gilmer, Surya Ganguli, Jascha Sohl-Dickstein
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test this ansatz we train ensembles of deep, fully connected, feed-forward neural networks of varying depth on MNIST and CIFAR10, with and without dropout. Our results confirm that neural networks are trainable precisely when their depth is not much larger than ξc. |
| Researcher Affiliation | Collaboration | Samuel S. Schoenholz (Google Brain); Justin Gilmer (Google Brain); Surya Ganguli (Stanford University); Jascha Sohl-Dickstein (Google Brain) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | To test this ansatz we train ensembles of deep, fully connected, feed-forward neural networks of varying depth on MNIST and CIFAR10, with and without dropout. |
| Dataset Splits | No | The paper mentions training on MNIST and CIFAR10, but does not explicitly provide specific dataset split information (percentages or counts) or reference predefined splits for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We train these networks using Stochastic Gradient Descent (SGD) and RMSProp on MNIST and CIFAR10. We use a learning rate of 10^-3 for SGD when L <= 200, 10^-4 for larger L, and 10^-5 for RMSProp. These learning rates were selected by grid search between 10^-6 and 10^-2 in exponentially spaced steps of size 10. |
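The learning-rate schedule quoted above can be summarized in a short sketch. This is an illustrative reconstruction of the selection rule, not code from the paper; the function name `learning_rate` and the exact grid construction are assumptions based on the quoted description.

```python
def learning_rate(optimizer: str, depth: int) -> float:
    """Learning rates as reported in the quoted setup:
    1e-3 for SGD when depth L <= 200, 1e-4 for deeper
    networks, and 1e-5 for RMSProp."""
    if optimizer == "rmsprop":
        return 1e-5
    if optimizer == "sgd":
        return 1e-3 if depth <= 200 else 1e-4
    raise ValueError(f"unknown optimizer: {optimizer}")

# Grid-search candidates: exponentially spaced steps of size 10
# between 1e-6 and 1e-2 (assumed to include both endpoints).
grid = [10.0 ** k for k in range(-6, -1)]
```

This makes explicit that the grid contains only five candidate values, so the reported rates coincide with grid points rather than finer-tuned values.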