What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?
Authors: Tiffany J Vlaar, Jonathan Frankle
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we put inferences of this kind to the test, systematically evaluating how linear interpolation and final performance vary when altering the data, choice of initialization, and other optimizer and architecture design choices. |
| Researcher Affiliation | Collaboration | Tiffany Vlaar (Department of Mathematics, University of Edinburgh, Edinburgh, United Kingdom); Jonathan Frankle (MosaicML). |
| Pseudocode | No | The paper describes methods in prose and through equations (e.g., Eq. 1, Eq. 2) but does not include any explicitly labeled pseudocode or algorithm blocks (a hedged sketch of the interpolation procedure appears below the table). |
| Open Source Code | No | The paper does not contain any explicit statements about the release of its source code or links to a code repository. |
| Open Datasets | Yes | We focus on a ResNet-18 (He et al., 2016) architecture with batch normalization trained for 100 epochs on CIFAR-10 data (Krizhevsky & Hinton, 2009) |
| Dataset Splits | No | The paper discusses training on CIFAR-10 data and evaluates test accuracy but does not explicitly state the dataset splits for training, validation, and testing (e.g., percentages or counts for each split). |
| Hardware Specification | Yes | We perform all our experiments in PyTorch using NVIDIA DGX-1 GPUs and use standard random PyTorch initialization. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for the software dependency. |
| Experiment Setup | Yes | We focus on a ResNet-18 (He et al., 2016) architecture with batch normalization trained for 100 epochs on CIFAR-10 data (Krizhevsky & Hinton, 2009) using SGD with momentum (0.9) and weight decay (5e-4) using PyTorch (Paszke et al., 2017). We use initial learning rate h = 0.1 that drops by 10x at epochs 33 and 66. For pretrained settings, we use initial learning rate h = 0.001 that drops by 10x after 30 epochs. |
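
The Pseudocode row notes that the interpolation procedure is given only in prose and equations. Below is a minimal sketch, not taken from the paper, of what linear interpolation between an initial and a final set of network parameters typically looks like: the weights are blended as theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final and the loss is evaluated at each alpha. The function name, its arguments, and the handling of batch-norm statistics are assumptions for illustration only.

```python
# Hedged sketch of loss interpolation between two parameter settings of the
# same architecture (e.g., weights at initialization and after training).
import copy
import torch

def interpolate_loss(model_init, model_final, loss_fn, data_loader,
                     num_points=21, device="cpu"):
    """Return (alphas, losses) along the line between the two parameter sets."""
    init_params = [p.detach().clone() for p in model_init.parameters()]
    final_params = [p.detach().clone() for p in model_final.parameters()]
    # Probe model reuses model_final's buffers (e.g., batch-norm running stats);
    # those are not interpolated here, which is a simplification.
    probe = copy.deepcopy(model_final).to(device).eval()

    alphas = torch.linspace(0.0, 1.0, num_points)
    losses = []
    for alpha in alphas:
        with torch.no_grad():
            # Overwrite the probe's weights with the interpolated parameters.
            for p, p0, p1 in zip(probe.parameters(), init_params, final_params):
                p.copy_((1.0 - alpha) * p0 + alpha * p1)
            total, count = 0.0, 0
            for x, y in data_loader:
                x, y = x.to(device), y.to(device)
                total += loss_fn(probe(x), y).item() * x.size(0)
                count += x.size(0)
        losses.append(total / count)
    return alphas.tolist(), losses
```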
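
The Experiment Setup row quotes the training configuration (ResNet-18 on CIFAR-10, 100 epochs, SGD with momentum 0.9 and weight decay 5e-4, learning rate 0.1 dropped by 10x at epochs 33 and 66). The sketch below maps those hyperparameters onto standard PyTorch calls; the torchvision model variant and the MultiStepLR schedule are assumptions, since the authors' own code is not released.

```python
# Hedged sketch of the reported training configuration; not the authors' code.
import torch
import torchvision

# torchvision's ResNet-18 uses an ImageNet-style stem; CIFAR variants often
# replace it with a 3x3 convolution. Treat this as an assumption.
model = torchvision.models.resnet18(num_classes=10)

optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4
)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[33, 66], gamma=0.1  # 10x drops at epochs 33 and 66
)

for epoch in range(100):
    # ... standard training loop over CIFAR-10 minibatches ...
    scheduler.step()
```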