When Are Solutions Connected in Deep Networks?

Authors: Quynh N. Nguyen, Pierre Bréchet, Marco Mondelli

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6 Numerical Experiments: We compare the losses along the path of Theorem 4.1 and the one in [24], which our theoretical analysis has improved upon. (A generic path-evaluation sketch appears after this table.)
Researcher Affiliation | Academia | Quynh Nguyen (MPI-MIS, Germany, quynh.nguyen@mis.mpg.de); Pierre Bréchet (MPI-MIS, Germany, pierre.brechet@mis.mpg.de); Marco Mondelli (IST Austria, marco.mondelli@ist.ac.at)
Pseudocode | No | The paper describes procedural steps, particularly in the proof of Theorem 4.1, but does not present structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | The code for reproducing the results is available at https://github.com/Quynh-Nguyen/mode_connectivity
Open Datasets | Yes | Datasets and architectures: We consider MNIST [26] and CIFAR-10 [23] datasets.
Dataset Splits | No | The paper mentions training with standard SGD but does not explicitly report training/validation/test dataset splits (percentages, sample counts, or split methodology).
Hardware Specification | Yes | All experiments were run on a single machine with an NVIDIA GeForce RTX 2080 Ti GPU and an Intel i7-8700K CPU.
Software Dependencies | Yes | Our code is written in Python using PyTorch v1.7.0.
Experiment Setup | Yes | We train each network by standard SGD with cross-entropy loss, batch size 100, and no explicit regularizers. ... The learning rate is set to 0.01 for the MNIST experiments and to 0.1 for the CIFAR-10 experiments. We use a step decay schedule for the learning rate, dividing it by 10 at epochs 50 and 75 (total 100 epochs).
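The quoted experiment setup maps onto a short PyTorch training loop. The following is a minimal sketch under stated assumptions: plain SGD without momentum, torchvision dataset loaders, a MultiStepLR step-decay schedule, and a placeholder `train` function. It is not the authors' released code, which is linked in the Open Source Code row above.

```python
# Hedged sketch of the quoted setup: SGD with cross-entropy, batch size 100,
# no explicit regularizers, lr 0.01 (MNIST) or 0.1 (CIFAR-10), divided by 10
# at epochs 50 and 75 over 100 epochs. Names and structure are illustrative.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def train(model, dataset="mnist", epochs=100, batch_size=100, device="cuda"):
    lr = 0.01 if dataset == "mnist" else 0.1          # per the quoted setup
    data_cls = datasets.MNIST if dataset == "mnist" else datasets.CIFAR10
    train_set = data_cls(root="./data", train=True, download=True,
                         transform=transforms.ToTensor())
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    # Momentum is not mentioned in the quoted setup; plain SGD is assumed here.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    # Step decay: divide the learning rate by 10 at epochs 50 and 75.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[50, 75], gamma=0.1)

    for epoch in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```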
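The Research Type row refers to comparing the training loss along two connecting paths between solutions: the path constructed in Theorem 4.1 and the one from [24]. Neither construction is reproduced here; the sketch below only shows the generic evaluation step, computing the loss at evenly spaced points along a straight-line interpolation between two trained parameter vectors. The function name `loss_along_path` and the linear path are illustrative assumptions, not the paper's constructions.

```python
# Hedged sketch: evaluate the average loss at points on a path between two
# trained networks. A straight linear interpolation is used for illustration
# only; the paths of Theorem 4.1 and of [24] are different constructions.
import copy
import torch

@torch.no_grad()
def loss_along_path(model_a, model_b, loader, criterion,
                    num_points=21, device="cuda"):
    state_a, state_b = model_a.state_dict(), model_b.state_dict()
    probe = copy.deepcopy(model_a).to(device)
    probe.eval()
    losses = []
    for t in torch.linspace(0.0, 1.0, num_points):
        # theta(t) = (1 - t) * theta_a + t * theta_b for floating-point entries;
        # integer buffers (e.g. BatchNorm counters) are copied from model_a.
        mixed = {}
        for k in state_a:
            if state_a[k].is_floating_point():
                mixed[k] = (1 - t) * state_a[k].to(device) + t * state_b[k].to(device)
            else:
                mixed[k] = state_a[k].to(device)
        probe.load_state_dict(mixed)
        total, count = 0.0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            total += criterion(probe(x), y).item() * x.size(0)
            count += x.size(0)
        losses.append(total / count)
    return losses
```

Called with two independently trained checkpoints and the training loader, this returns a loss profile along the chosen path, which is the kind of quantity the paper's numerical experiments compare for the two path constructions.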