Complexity of Linear Regions in Deep Networks
Authors: Boris Hanin, David Rolnick
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3. Experiments: We empirically verified our theorems and further examined how linear regions of a network change during training. All experiments below were performed with fully-connected networks, initialized with He normal weights (i.i.d. with variance 2/fan-in) and biases drawn i.i.d. normal with variance 10^-6 (to prevent collapse of regions at initialization, which occurs when all biases are identically zero). Training was performed on the vectorized MNIST (input dimension 784) using the Adam optimizer at learning rate 10^-3. All networks attain test accuracy in the range 95–98%. |
| Researcher Affiliation | Collaboration | Boris Hanin¹ and David Rolnick² (equal contribution). ¹Department of Mathematics, Texas A&M University, and Facebook AI Research, New York; ²University of Pennsylvania. |
| Pseudocode | No | The paper describes methods and experiments in prose and mathematical notation but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about making its source code publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Training was performed on the vectorized MNIST (input dimension 784) |
| Dataset Splits | No | The paper mentions 'Training was performed on the vectorized MNIST' but does not specify the exact training, validation, or test dataset splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | The paper describes the experimental setup regarding networks and initialization (e.g., 'fully-connected networks', 'He normal weights', 'Adam optimizer'), but it does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of the 'Adam optimizer' and 'ReLU activation' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | All experiments below were performed with fully-connected networks, initialized with He normal weights (i.i.d. with variance 2/fan-in) and biases drawn i.i.d. normal with variance 10^-6 (to prevent collapse of regions at initialization, which occurs when all biases are identically zero). Training was performed on the vectorized MNIST (input dimension 784) using the Adam optimizer at learning rate 10^-3. |
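
Although the paper names no framework, the setup quoted above is concrete enough to reconstruct. The sketch below is a minimal PyTorch reconstruction under that assumption; the `build_mlp` helper name and the layer widths are illustrative choices, not details from the paper, which varies depth and width across its experiments.

```python
import torch
import torch.nn as nn

def build_mlp(widths=(784, 64, 64, 10)):
    """Fully-connected ReLU net with the paper's stated initialization.

    Weights: He normal, i.i.d. with variance 2/fan-in.
    Biases: i.i.d. normal with variance 1e-6 (std 1e-3), preventing the
    collapse of linear regions that occurs when all biases are identically zero.
    The widths here are placeholders; the paper varies depth and width.
    """
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        linear = nn.Linear(fan_in, fan_out)
        nn.init.kaiming_normal_(linear.weight, nonlinearity="relu")  # var = 2/fan-in
        nn.init.normal_(linear.bias, mean=0.0, std=1e-3)             # var = 1e-6
        layers += [linear, nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the ReLU after the output layer

model = build_mlp()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate 10^-3
loss_fn = nn.CrossEntropyLoss()
# Training loop over vectorized MNIST (images flattened to 784-dim inputs)
# omitted; the paper reports resulting test accuracy in the range 95-98%.
```

The nonzero bias variance is the one non-default choice worth preserving: as the quoted setup notes, initializing all biases to exactly zero collapses the linear regions at initialization, which would undermine the quantities the experiments measure.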