The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes

Authors: Stanislav Fort, Adam Scherlis

AAAI 2019, pp. 3574-3581

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We ran a large number of experiments to determine the nature of the objective landscape. We focused on fully-connected and convolutional networks. We used the MNIST (LeCun and Cortes) and CIFAR-10 (Krizhevsky 2009) image classification datasets. We explored a range of widths and depths of fully-connected networks to determine the stability of our results, and considered the ReLU and tanh non-linearities. (A hedged sketch of this network family appears after the table.)
Researcher Affiliation | Academia | Stanislav Fort, Physics Department / KIPAC, Stanford University, 382 Via Pueblo, Stanford, CA 94305, sfort1@stanford.edu; Adam Scherlis, Physics Department / SITP, Stanford University, 382 Via Pueblo, Stanford, CA 94305, scherlis@stanford.edu
Pseudocode | No | The paper describes methods and processes in narrative text and figures, but it does not include any explicitly labeled pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not provide any statement about releasing code, a link to a code repository, or any mention of code in supplementary materials.
Open Datasets | Yes | We used the MNIST (LeCun and Cortes) and CIFAR-10 (Krizhevsky 2009) image classification datasets.
Dataset Splits | No | The paper refers to "Validation accuracy" and "Initial validation loss" in figure captions and discussions, but it does not state the dataset split used, either as percentages (e.g., 80/10/10) or as sample counts for the validation set.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions the "Adam optimizer" and "cross-entropy loss" but does not specify software versions (e.g., TensorFlow 2.x, PyTorch 1.x, Python 3.x) needed for reproducibility.
Experiment Setup | Yes | We used the cross-entropy loss and the Adam optimizer with a learning rate of 10^-3. We explored a range of widths and depths of fully-connected networks to determine the stability of our results, and considered the ReLU and tanh non-linearities. (A minimal sketch of this training setup follows the table.)
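
Since the paper releases no code (see "Open Source Code" above), the following is a minimal sketch, assuming PyTorch, of the fully-connected model family the table describes: configurable depth and width with a ReLU or tanh non-linearity. The helper name make_fc_net and the default sizes are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the fully-connected networks described above.
    # Framework (PyTorch), helper name, and defaults are illustrative assumptions.
    import torch.nn as nn

    def make_fc_net(in_dim=28 * 28, num_classes=10, width=128, depth=3, nonlinearity="relu"):
        """Fully-connected classifier with `depth` hidden layers of `width` units each."""
        act = nn.ReLU if nonlinearity == "relu" else nn.Tanh  # paper considers ReLU and tanh
        layers, prev = [nn.Flatten()], in_dim
        for _ in range(depth):
            layers += [nn.Linear(prev, width), act()]
            prev = width
        layers.append(nn.Linear(prev, num_classes))
        return nn.Sequential(*layers)

A call such as make_fc_net(width=256, depth=5, nonlinearity="tanh") would correspond to one point in the width/depth grid the paper explores.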
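Similarly, here is a minimal sketch of the reported training setup (cross-entropy loss, Adam optimizer, learning rate 10^-3). The framework (PyTorch/torchvision), the MNIST data pipeline, and the batch size are assumptions; the paper does not state them.

    # Hypothetical training-loop sketch matching the reported setup
    # (cross-entropy loss, Adam, learning rate 1e-3). Framework and batch
    # size are assumptions, not stated in the paper.
    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    train_set = datasets.MNIST("data", train=True, download=True,
                               transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=128, shuffle=True)  # batch size assumed

    model = make_fc_net()                                       # sketch above
    criterion = nn.CrossEntropyLoss()                           # cross-entropy loss
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # learning rate 10^-3

    model.train()
    for images, labels in loader:                               # one epoch, for brevity
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()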