Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

Authors: Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry P. Vetrov, Andrew G. Wilson

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using FGE we can train high-performing ensembles in the time required to train a single model. We achieve improved performance compared to the recent state-of-the-art Snapshot Ensembles, on CIFAR-10, CIFAR-100, and ImageNet.
Researcher Affiliation | Collaboration | Samsung AI Center in Moscow; Skolkovo Institute of Science and Technology; Cornell University; Samsung-HSE Laboratory, National Research University Higher School of Economics; National Research University Higher School of Economics
Pseudocode | Yes | An outline of the algorithm is provided in the supplement.
Open Source Code | Yes | We release the code for reproducing the results in this paper at https://github.com/timgaripov/dnn-mode-connectivity
Open Datasets | Yes | We test VGG-16 [19], a 28-layer Wide ResNet with widening factor 10 [22] and a 158-layer ResNet [9] on CIFAR-10, and VGG-16, 164-layer ResNet-bottleneck [9] on CIFAR-100. ImageNet ILSVRC-2012 [18] is a large-scale dataset containing 1.2 million training images and 50000 validation images divided into 1000 classes.
Dataset Splits | Yes | ImageNet ILSVRC-2012 [18] is a large-scale dataset containing 1.2 million training images and 50000 validation images divided into 1000 classes.
Hardware Specification | No | The paper does not specify the hardware used for its experiments, such as GPU/CPU models, clock speeds, or memory amounts.
Software Dependencies | No | The paper does not list the ancillary software needed to replicate the experiments (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4).
Experiment Setup | Yes | For FGE, with VGG we use cycle length c = 2 epochs, and a total of 22 models in the final ensemble. With ResNet and Wide ResNet we use c = 4 epochs, and the total number of models in the final ensemble is 12 for Wide ResNets and 6 for ResNets. For VGG we set the learning rates to α₁ = 10⁻², α₂ = 5·10⁻⁴; for ResNet and Wide ResNet models we set α₁ = 5·10⁻², α₂ = 5·10⁻⁴.
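
For readers trying to reproduce the setup above, the hyperparameters describe FGE's cyclical learning-rate schedule: within each cycle of c epochs the rate ramps linearly from α₁ down to α₂ and back, and one ensemble member is saved each time the rate bottoms out. Below is a minimal sketch under that reading; fge_lr, run_fge, train_batch, and snapshot are hypothetical names for illustration, not part of the released code.

```python
def fge_lr(iteration, iters_per_cycle, alpha_1, alpha_2):
    """Piecewise-linear rate: alpha_1 -> alpha_2 over the first half-cycle, back to alpha_1 over the second."""
    t = ((iteration % iters_per_cycle) + 1) / iters_per_cycle  # position within the cycle, in (0, 1]
    if t <= 0.5:
        return (1.0 - 2.0 * t) * alpha_1 + 2.0 * t * alpha_2
    return (2.0 - 2.0 * t) * alpha_2 + (2.0 * t - 1.0) * alpha_1


def run_fge(num_cycles, iters_per_cycle, alpha_1, alpha_2, train_batch, snapshot):
    """Run FGE-style training and collect one ensemble member per cycle.

    train_batch(lr) is assumed to perform one SGD step at learning rate lr;
    snapshot() is assumed to return a copy of the current weights. Both are placeholders.
    """
    ensemble = []
    for i in range(num_cycles * iters_per_cycle):
        lr = fge_lr(i, iters_per_cycle, alpha_1, alpha_2)
        train_batch(lr)
        # Mid-cycle the rate reaches its minimum alpha_2; save an ensemble member there.
        if (i % iters_per_cycle) + 1 == iters_per_cycle // 2:
            ensemble.append(snapshot())
    return ensemble
```

With the CIFAR settings quoted above, iters_per_cycle would be c (2 or 4 epochs) times the number of mini-batches per epoch, and alpha_1/alpha_2 would be set per architecture as in the Experiment Setup row.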