Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

Authors: Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski

NeurIPS 2019

Reproducibility Assessment: each entry below lists the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
LLM Response: "In this paper we study the three critical components of the Lottery Ticket (LT) algorithm... Finally, we discover the existence of Supermasks, masks that can be applied to an untrained, randomly initialized network to produce a model with performance far better than chance (86% on MNIST, 41% on CIFAR-10). ... In this section and throughout the remainder of the paper, we follow the experimental framework from [5] and perform iterative pruning experiments on a 3-layer fully-connected network (FC) trained on MNIST [12] and on three convolutional neural networks (CNNs), Conv2, Conv4, and Conv6 ... trained on CIFAR-10 [11]."

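To make the quoted Supermask result concrete: a Supermask is a binary mask applied element-wise to the weights of an untrained, randomly initialized network. Below is a minimal NumPy sketch under illustrative assumptions; the layer sizes and the magnitude-based masking rule are placeholders (the paper evaluates several mask criteria and also learns masks directly), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_layer(n_in, n_out):
    # Untrained weights from a scaled normal init (illustrative choice).
    return rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))

# A small MNIST-shaped network; sizes are placeholders, not the paper's FC net.
W1, W2 = random_layer(784, 300), random_layer(300, 10)

def magnitude_mask(W, keep=0.5):
    # Stand-in criterion: keep the largest-magnitude `keep` fraction of weights.
    thresh = np.quantile(np.abs(W), 1.0 - keep)
    return (np.abs(W) >= thresh).astype(W.dtype)

M1, M2 = magnitude_mask(W1), magnitude_mask(W2)

def forward(x):
    # The weights themselves are never trained; only the mask is applied.
    h = np.maximum(x @ (W1 * M1), 0.0)  # ReLU hidden layer
    return h @ (W2 * M2)                # class logits

logits = forward(rng.normal(size=(1, 784)))  # e.g. one flattened MNIST image
```
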
Researcher Affiliation: Industry
LLM Response: "Hattie Zhou, Uber, hattie@uber.com; Janice Lan, Uber AI, janlan@uber.com; Rosanne Liu, Uber AI, rosanne@uber.com; Jason Yosinski, Uber AI, yosinski@uber.com"

Pseudocode: No
LLM Response: The paper describes the lottery ticket algorithm as a numbered list (steps 0-5), but presents it as prose rather than as a formatted pseudocode or algorithm block.

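Since the paper gives the LT algorithm only as a prose-numbered list, a minimal sketch of the iterative magnitude-pruning loop may help. The `train` callback, the per-round pruning fraction, and the round count are assumptions for illustration, not the paper's exact procedure or settings.

```python
import numpy as np

def lottery_ticket(init_weights, train, rounds=5, prune_frac=0.2):
    """Sketch of iterative pruning: train, prune smallest-magnitude survivors,
    rewind survivors to their initial values, repeat.

    `train` is a caller-supplied placeholder (not a real API): it maps a list
    of weight arrays to trained arrays of the same shapes.
    """
    masks = [np.ones_like(w) for w in init_weights]
    for _ in range(rounds):
        # Train the masked network, starting again from the original init
        # each round (this restart is the "rewind" step).
        trained = train([w * m for w, m in zip(init_weights, masks)])
        # Prune: zero out the smallest-magnitude weights still surviving.
        for i, (w, m) in enumerate(zip(trained, masks)):
            alive = np.abs(w[m == 1])
            if alive.size:
                thresh = np.quantile(alive, prune_frac)
                masks[i] = m * (np.abs(w) >= thresh)
    return masks  # the final masks define the "winning ticket" subnetwork
```
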
Open Source Code: Yes
LLM Response: "We make our code available at https://github.com/uber-research/deconstructing-lottery-tickets."

Open Datasets: Yes
LLM Response: "In this section and throughout the remainder of the paper, we follow the experimental framework from [5] and perform iterative pruning experiments on a 3-layer fully-connected network (FC) trained on MNIST [12] and on three convolutional neural networks (CNNs), Conv2, Conv4, and Conv6 ... trained on CIFAR-10 [11]."

Dataset Splits: No
LLM Response: The paper mentions an 'early stopping iteration' based on minimum validation loss, implying that a validation set was used, but it does not specify exact percentages or sample counts for the training, validation, or test splits. It states: 'For more architecture and training details, see Section S1 in Supplementary Information,' which is outside the main text.

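For clarity on the criterion this entry refers to: the 'early stopping iteration' is the training iteration at which validation loss is lowest. A minimal sketch follows, assuming losses are recorded at a fixed evaluation interval; the interval and the history values are made up for illustration.

```python
def early_stopping_iteration(val_losses):
    """val_losses: (iteration, validation_loss) pairs recorded during training.
    Returns the iteration at which validation loss is minimal."""
    best_iter, _ = min(val_losses, key=lambda pair: pair[1])
    return best_iter

# Hypothetical history, evaluated every 100 iterations.
history = [(0, 2.30), (100, 0.95), (200, 0.41), (300, 0.38), (400, 0.44)]
assert early_stopping_iteration(history) == 300
```
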
Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.

Software Dependencies: No
LLM Response: The paper does not provide specific ancillary software details, such as library or solver names with version numbers.

Experiment Setup: No
LLM Response: The paper states: 'For more architecture and training details, see Section S1 in Supplementary Information.' This indicates that specific experimental setup details, such as hyperparameters or system-level training settings, are not provided in the main text.