Linear Mode Connectivity and the Lottery Ticket Hypothesis
Authors: Jonathan Frankle, Gintare Karolina Dziugaite, Daniel Roy, Michael Carbin
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation). We find that standard vision models become stable to SGD noise in this way early in training. From then on, the outcome of optimization is determined to a linearly connected region. We use this technique to study iterative magnitude pruning (IMP), the procedure used by work on the lottery ticket hypothesis to identify subnetworks that could have trained in isolation to full accuracy. We find that these subnetworks only reach full accuracy when they are stable to SGD noise, which either occurs at initialization for small-scale settings (MNIST) or early in training for large-scale settings (ResNet-50 and Inception-v3 on ImageNet). |
| Researcher Affiliation | Collaboration | ¹MIT CSAIL, ²Element AI, ³University of Toronto, ⁴Vector Institute. |
| Pseudocode | Yes | Algorithm 1: Compute instability of W_k with function f. (Hedged sketches of this computation and of IMP follow the table.) |
| Open Source Code | No | The paper does not provide an explicit statement of open-source code availability for its methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We study image classification networks on MNIST, CIFAR-10, and ImageNet as listed in Table 1. |
| Dataset Splits | No | The paper mentions using train and test sets for evaluation (e.g., 'test set instability'), but it does not provide explicit details about the dataset splits (e.g., specific percentages or sample counts for training, validation, and testing). |
| Hardware Specification | No | The paper mentions 'GPU resources' and 'TPU resources' from IBM and Google respectively, but does not provide specific hardware models (e.g., GPU/CPU models, memory details) for these resources. |
| Software Dependencies | No | The paper mentions using the 'TensorFlow Research Cloud' but does not specify any software names with version numbers for libraries, frameworks, or operating systems used in the experiments. |
| Experiment Setup | Yes | Table 1. Our networks and hyperparameters. Accuracies are the means and standard deviations across three initializations. Hyperparameters for ResNet-20 standard are from He et al. (2016). Hyperparameters for VGG-16 standard are from Liu et al. (2019). Hyperparameters for low, warmup, and LeNet are adapted from Frankle & Carbin (2019). Hyperparameters for ImageNet networks are from Google's reference TPU code (Google, 2018). |
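The paper's Algorithm 1 measures instability as the error barrier along the linear path between two children trained from the same weights W_k under different samples of SGD noise. Below is a minimal PyTorch sketch of that computation, not the authors' code: `model`, `eval_error`, and the state dicts are hypothetical placeholders, and the 30-point interpolation grid is an assumption about the evaluation resolution.

```python
import copy
import torch

def interpolate_state(sd1, sd2, alpha):
    """(1 - alpha) * sd1 + alpha * sd2, elementwise over floating-point
    tensors; integer buffers (e.g., BatchNorm's num_batches_tracked)
    are copied from sd1 unchanged."""
    out = {}
    for key in sd1:
        if sd1[key].is_floating_point():
            out[key] = (1 - alpha) * sd1[key] + alpha * sd2[key]
        else:
            out[key] = sd1[key]
    return out

def instability(model, sd1, sd2, eval_error, num_points=30):
    """Error barrier between two solutions trained from the same W_k.

    sd1, sd2: state dicts of the two trained children.
    eval_error: hypothetical helper returning a model's test error.
    Returns the maximum error along the linear path minus the mean of
    the two endpoint errors (near zero when the networks are stable)."""
    errors = []
    for alpha in torch.linspace(0.0, 1.0, num_points):
        child = copy.deepcopy(model)
        child.load_state_dict(interpolate_state(sd1, sd2, alpha.item()))
        errors.append(eval_error(child))
    return max(errors) - 0.5 * (errors[0] + errors[-1])
```

The IMP procedure the paper studies can be sketched in the same spirit: train to completion, prune the lowest-magnitude fraction of surviving weights globally, rewind the survivors to their values at step k, and repeat. The `train_fn` callable is an assumed helper, and the 20% per-round rate follows the lottery-ticket literature; everything else is illustrative.

```python
def prune_lowest(sd, masks, frac=0.2):
    """Drop the lowest-magnitude `frac` of currently surviving weights."""
    survivors = torch.cat([sd[k][masks[k].bool()].abs() for k in masks])
    cutoff = survivors.kthvalue(max(1, int(frac * survivors.numel()))).values
    return {k: masks[k] * (sd[k].abs() > cutoff).float() for k in masks}

def imp(rewind_sd, train_fn, rounds=10, frac=0.2):
    """Iterative magnitude pruning with rewinding to step k (sketch).

    rewind_sd: weights W_k saved at iteration k (k = 0 recovers the
    original lottery-ticket procedure). train_fn(start_sd, masks) is
    assumed to train the masked network to completion and return its
    final state dict."""
    masks = {k: torch.ones_like(v) for k, v in rewind_sd.items()
             if v.is_floating_point()}
    for _ in range(rounds):
        final_sd = train_fn(rewind_sd, masks)
        masks = prune_lowest(final_sd, masks, frac)
    return masks  # apply to rewind_sd and retrain the subnetwork
```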