Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training
Authors: Shiwei Liu, Lu Yin, Decebal Constantin Mocanu, Mykola Pechenizkiy
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a series of experiments to support our conjecture and achieve the state-of-the-art sparse training performance with ResNet-50 on ImageNet. More impressively, ITOP achieves dominant performance over the overparameterization-based sparse methods at extreme sparsities. When trained with ResNet-34 on CIFAR-100, ITOP can match the performance of the dense model at an extreme sparsity of 98%. |
| Researcher Affiliation | Academia | 1) Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, the Netherlands; 2) Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, 7522 NB Enschede, the Netherlands. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | https://github.com/Shiweiliuiiiiiii/In-Time-Over-Parameterization |
| Open Datasets | Yes | We study Multi-layer Perceptron (MLP) on CIFAR-10, VGG-16 on CIFAR-10, ResNet-34 on CIFAR-100, and ResNet-50 on ImageNet. |
| Dataset Splits | No | The paper mentions a 'minimum validation loss' in Section 3.1 but does not provide specific details on the dataset splits (e.g., percentages or sample counts) used for training, validation, or testing in the main text. It refers to Appendix A for experimental details, but these details are not present in the main body. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper states 'We use PyTorch as our library.' in Section 3.2, but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We train MLP, VGG-16, and ResNet-34 with various T and report the test accuracy. ... We train MLP, VGG-16, and ResNet-34 for an extended training time with a large T. We safely choose T as 1500 for MLPs, 2000 for VGG-16, and 1000 for ResNet-34... In addition to the training time, the anchor points of the learning rate schedule are also scaled by the same factor. ... More precisely, we choose an update interval T of 4000, a batch size of 64, and an initial pruning rate of 0.5... (see the configuration sketch below the table). |
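
The Experiment Setup row describes a dynamic sparse training schedule: every T batches a fraction of the active weights is pruned and the same number of inactive connections is regrown, and when training is extended the learning-rate anchor points are scaled by the same factor. Below is a minimal sketch of such an update step under these assumptions, using magnitude-based pruning with random (SET-style) regrowth; the function names `prune_and_regrow` and `scale_lr_milestones` and all variable names are illustrative, not taken from the authors' released code.

```python
import torch


def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor, prune_rate: float) -> torch.Tensor:
    """Drop the smallest-magnitude active weights and regrow the same number of
    inactive connections at random (SET-style growth), keeping the parameter
    count fixed while new weights are explored over time."""
    active = mask.bool()
    n_prune = int(prune_rate * active.sum().item())
    if n_prune == 0:
        return mask

    # Prune: zero out the n_prune active weights with the smallest magnitude.
    magnitudes = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(magnitudes.flatten(), n_prune, largest=False).indices
    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = 0

    # Regrow: activate n_prune randomly chosen inactive positions. In practice
    # the regrown weights are re-initialized (e.g. to zero) before training resumes.
    inactive_idx = (new_mask == 0).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_prune]]
    new_mask[grow_idx] = 1
    return new_mask.view_as(mask)


def scale_lr_milestones(milestones, factor):
    """When training time is extended by `factor`, the learning-rate decay
    anchor points are scaled by the same factor, as stated in the setup."""
    return [int(m * factor) for m in milestones]


# Illustrative use of the reported hyperparameters: update interval T = 4000
# batches, initial pruning rate 0.5 (typically decayed over training).
T, prune_rate = 4000, 0.5
```

The sketch is only meant to make the reported hyperparameters concrete; growth criteria other than random selection (e.g., gradient-based growth) would fit the same interface.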