When Are Solutions Connected in Deep Networks?
Authors: Quynh N. Nguyen, Pierre Bréchet, Marco Mondelli
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6 (Numerical Experiments): We compare the losses along the path of Theorem 4.1 and the one in [24] which our theoretical analysis has improved upon. |
| Researcher Affiliation | Academia | Quynh Nguyen, MPI-MIS, Germany (quynh.nguyen@mis.mpg.de); Pierre Bréchet, MPI-MIS, Germany (pierre.brechet@mis.mpg.de); Marco Mondelli, IST Austria (marco.mondelli@ist.ac.at) |
| Pseudocode | No | The paper describes procedural steps, particularly in the Proof of Theorem 4.1, but does not present any structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code for reproducing the results is available at https://github.com/Quynh-Nguyen/mode_connectivity |
| Open Datasets | Yes | Datasets and architectures. We consider MNIST [26] and CIFAR-10 [23] datasets. |
| Dataset Splits | No | The paper mentions using standard SGD for training but does not explicitly provide details about training/validation/test dataset splits, such as percentages, sample counts, or specific split methodologies. |
| Hardware Specification | Yes | All experiments were run on a single machine with an NVIDIA GeForce RTX 2080 Ti GPU and an Intel i7-8700K CPU. |
| Software Dependencies | Yes | Our code is written in Python using PyTorch v1.7.0. |
| Experiment Setup | Yes | We train each network by standard SGD with cross-entropy loss, batch size 100 and no explicit regularizers. ... The learning rate is set to 0.01 for the MNIST experiments and to 0.1 for the CIFAR-10 experiments. We use a step decay schedule for the learning rate, dividing it by 10 at epochs 50 and 75 (total 100 epochs). |
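The quoted experiment setup maps onto a standard PyTorch training loop. Below is a minimal sketch under only the details stated in the table (SGD with cross-entropy loss, batch size 100, no explicit regularizers, learning rate 0.01 for MNIST and 0.1 for CIFAR-10, divided by 10 at epochs 50 and 75 over 100 epochs). The `model`, `train_set`, and `dataset` arguments are placeholders, and momentum/weight decay are left at their defaults since the quote does not specify them; this is not the authors' released code (see the repository linked above for that).

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model, train_set, dataset="mnist", epochs=100, device="cuda"):
    # Learning rate per the quoted setup: 0.01 for MNIST, 0.1 for CIFAR-10 (assumption: lowercase dataset tag).
    lr = 0.01 if dataset == "mnist" else 0.1

    # Batch size 100, plain SGD, no explicit regularizers (momentum/weight decay not stated, so defaults).
    loader = DataLoader(train_set, batch_size=100, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    # Step decay: divide the learning rate by 10 at epochs 50 and 75 (100 epochs total).
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 75], gamma=0.1)
    criterion = nn.CrossEntropyLoss()

    model.to(device)
    for epoch in range(epochs):
        model.train()
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```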