Large Scale Structure of Neural Network Loss Landscapes

Authors: Stanislav Fort, Stanislaw Jastrzebski

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We propose and experimentally verify a unified phenomenological model of the loss landscape that incorporates many of them." |
| Researcher Affiliation | Collaboration | Stanislav Fort (Google Research, Zurich, Switzerland); Stanislaw Jastrzebski (New York University, New York, United States) |
| Pseudocode | No | The paper describes experimental procedures and calculations step by step in paragraph form (e.g., Section 3.2 for the surrogate loss, Section 4.2 for finding connectors), but does not present them as a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any links to source code repositories, nor does it state that the code for the described methodology is publicly available or included in supplementary materials. |
| Open Datasets | Yes | "We use Simple CNN model (...) and run experiments on the CIFAR-10 dataset. (...) and used MNIST, Fashion MNIST, CIFAR-10 and CIFAR-100 datasets." |
| Dataset Splits | No | The paper mentions well-known datasets such as CIFAR-10 and MNIST but does not specify percentages or counts for training, validation, or test splits, nor does it cite standard splits or describe how the data was partitioned for reproducibility. |
| Hardware Specification | No | The paper states that experiments were run in TensorFlow but gives no details about the hardware used, such as GPU models, CPU types, or cloud computing instances and their specifications. |
| Software Dependencies | No | The paper mentions TensorFlow and the Adam optimizer but does not specify version numbers for these or for any other software libraries or programming languages, which would be needed for reproducibility. |
| Experiment Setup | Yes | "Unless otherwise noted (as in Figure 6), we ran training with a constant learning rate with the Adam optimizer." |
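To make the documented setup concrete, the following is a minimal sketch, assuming a standard tf.keras CIFAR-10 pipeline. Only the pieces stated above come from the paper: TensorFlow, a "Simple CNN" on CIFAR-10, and the Adam optimizer with a constant learning rate. The specific architecture, learning rate value, batch size, and epoch count below are illustrative assumptions, since the paper (as assessed above) does not report them in reproducible detail.

```python
# Sketch of the reported setup: a small CNN on CIFAR-10 trained with Adam
# at a constant learning rate, in TensorFlow. Hyperparameters and the exact
# architecture are assumptions, not the paper's specification.
import tensorflow as tf

# CIFAR-10 ships with the standard 50,000/10,000 train/test split;
# the paper does not describe any further validation split.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Hypothetical small CNN standing in for the paper's "Simple CNN" model.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Adam with a constant learning rate (no schedule), as stated in the paper;
# the value 1e-3 is assumed.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

model.fit(x_train, y_train, epochs=10, batch_size=128,
          validation_data=(x_test, y_test))
```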