The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes
Authors: Stanislav Fort, Adam Scherlis (pp. 3574-3581)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran a large number of experiments to determine the nature of the objective landscape. We focused on fully-connected and convolutional networks. We used the MNIST (LeCun and Cortes) and CIFAR-10 (Krizhevsky 2009) image classification datasets. We explored a range of widths and depths of fully-connected networks to determine the stability of our results, and considered the ReLU and tanh non-linearities. |
| Researcher Affiliation | Academia | Stanislav Fort Physics Department / KIPAC Stanford University 382 Via Pueblo, Stanford, CA 94305 sfort1@stanford.edu Adam Scherlis Physics Department / SITP Stanford University 382 Via Pueblo, Stanford, CA 94305 scherlis@stanford.edu |
| Pseudocode | No | The paper describes methods and processes in narrative text and figures, but it does not include any explicitly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not provide any statements about releasing code, a link to a code repository, or mention of code in supplementary materials. |
| Open Datasets | Yes | We used the MNIST (LeCun and Cortes) and CIFAR-10 (Krizhevsky 2009) image classification datasets. See the dataset-loading sketch after the table. |
| Dataset Splits | No | The paper refers to "Validation accuracy" and "Initial validation loss" in figure captions and discussions, but it does not state the split percentages or sample counts used for the validation set (e.g., an 80/10/10 split or a specific validation-set size). See the illustrative split sketch after the table. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions "Adam optimizer" and "cross-entropy loss" but does not specify software versions (e.g., TensorFlow 2.x, PyTorch 1.x, Python 3.x) for reproducibility. |
| Experiment Setup | Yes | We used the cross-entropy loss and the Adam optimizer with the learning rate of 10^-3. We explored a range of widths and depths of fully-connected networks to determine the stability of our results, and considered the ReLU and tanh non-linearities. See the training-setup sketch after the table. |
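
The paper names the datasets but, as noted above, not the software used to obtain them. As a hedged illustration only, the sketch below fetches MNIST and CIFAR-10 with `torchvision`; the framework choice, the transform, and the batch size are assumptions, not something the paper specifies.

```python
# Hypothetical loading of the two datasets named in the paper.
# The paper does not state a framework; torchvision is an assumption here.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()  # converts images to tensors scaled to [0, 1]

# Both datasets ship with a standard train/test split; the paper does not
# report how its validation data relate to these.
mnist_train = datasets.MNIST(root="data", train=True, download=True, transform=to_tensor)
mnist_test = datasets.MNIST(root="data", train=False, download=True, transform=to_tensor)
cifar_train = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10(root="data", train=False, download=True, transform=to_tensor)

# Batch size is an assumption; the paper does not report one.
train_loader = DataLoader(mnist_train, batch_size=128, shuffle=True)
```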
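Since the paper reports validation metrics without stating its split, the fragment below shows one common choice purely for illustration: a 90/10 carve-out of the MNIST training set via `random_split`. The fractions are invented, not taken from the paper.

```python
from torch.utils.data import random_split

# Invented 90/10 train/validation split of the MNIST training set;
# the paper reports validation metrics but not the split it used.
n_val = len(mnist_train) // 10
train_set, val_set = random_split(mnist_train, [len(mnist_train) - n_val, n_val])
```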
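The setup the paper does state is the cross-entropy loss, the Adam optimizer with learning rate 10^-3, and fully-connected networks with ReLU or tanh non-linearities. A minimal PyTorch sketch of that configuration follows; the depth, width, and training-loop details are illustrative assumptions, since the paper sweeps a range of architectures rather than fixing one.

```python
import torch
import torch.nn as nn

def make_fc_net(depth, width, activation=nn.ReLU):
    """Fully-connected MNIST classifier. The paper sweeps depth and width;
    the values passed below are placeholders, not the paper's settings."""
    layers = [nn.Flatten()]
    in_features = 28 * 28  # MNIST image size
    for _ in range(depth):
        layers += [nn.Linear(in_features, width), activation()]
        in_features = width
    layers.append(nn.Linear(in_features, 10))  # 10 output classes
    return nn.Sequential(*layers)

model = make_fc_net(depth=3, width=256)  # sizes assumed; pass activation=nn.Tanh for tanh
criterion = nn.CrossEntropyLoss()        # stated in the paper
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate stated in the paper

# One pass over the training data (train_loader from the loading sketch above).
for images, labels in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```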