NTK-SAP: Improving neural network pruning by aligning training dynamics
Authors: Yite Wang, Dawei Li, Ruoyu Sun
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our method achieves better performance than all baselines on multiple datasets. Our code is available at https://github.com/YiteWang/NTK-SAP. Empirically, we show that NTK-SAP, as a data-agnostic foresight pruning method, achieves state-of-the-art performance in multiple settings. |
| Researcher Affiliation | Academia | 1University of Illinois Urbana-Champaign, USA 2Shenzhen International Center for Industrial and Applied Mathematics, Shenzhen Research Institute of Big Data 3School of Data Science, The Chinese University of Hong Kong, Shenzhen, China |
| Pseudocode | Yes | Algorithm 1 Neural Tangent Kernel Spectrum-Aware Pruning (NTK-SAP) (a hedged sketch of the general idea is given after this table) |
| Open Source Code | Yes | Our code is available at https://github.com/YiteWang/NTK-SAP. |
| Open Datasets | Yes | We use CIFAR-10, CIFAR-100, Tiny-ImageNet and ImageNet. They do not contain personally identifiable information or offensive content. |
| Dataset Splits | No | The paper uses standard public datasets (CIFAR-10, CIFAR-100, Tiny-ImageNet, ImageNet) and refers to existing training protocols, but it does not explicitly provide percentages, sample counts, or specific references for how these datasets were split into training, validation, and test sets within the paper's text. |
| Hardware Specification | Yes | All of our experiments were run on NVIDIA V100s. Experiments on CIFAR-10/100 and Tiny-ImageNet datasets were run on a single GPU at a time. We use 2 and 4 GPUs for ResNet-18 and ResNet-50 in ImageNet experiments, respectively. |
| Software Dependencies | No | The paper states, 'We use the torchvision implementations' and 'Our code is based on the original code of Synflow,' but it does not provide specific version numbers for these or any other software components. |
| Experiment Setup | Yes | Target datasets, models, and sparsity ratios. ... Details of training hyperparameters can be found in Appendix A. ... We prune networks using a batch size of 256 for CIFAR-10/100 and Tiny-ImageNet datasets and a batch size of 128 for ImageNet experiments. ... Table 2: Training hyper-parameters used in this work. Network, Dataset, Epochs, Batch, Optimizer, Momentum, LR, LR drop, Weight decay, Initialization. |
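
The Pseudocode and Experiment Setup rows quote the paper's Algorithm 1 (Neural Tangent Kernel Spectrum-Aware Pruning) and its pruning batch sizes without reproducing the procedure itself. The PyTorch sketch below is a minimal, hedged illustration of the general idea described in the quoted excerpts: data-agnostic foresight pruning that scores weights through a fixed-weight NTK trace surrogate evaluated on random Gaussian inputs. It is not the authors' Algorithm 1; the `MaskedLinear` module, `ntk_trace_surrogate`, `prune_by_ntk_score`, the single weight perturbation (the paper averages over several random weight configurations), and all hyperparameter defaults are assumptions introduced for this example. The released repository at https://github.com/YiteWang/NTK-SAP is the authoritative reference.

```python
# Hedged sketch, NOT the authors' Algorithm 1: it only illustrates the high-level
# recipe the quoted excerpts describe -- data-agnostic foresight pruning that scores
# weights by their effect on a fixed-weight NTK trace surrogate, estimated from
# random Gaussian inputs and a small random weight perturbation. The toy
# MaskedLinear module, eps, rounds, the density schedule, and the single-perturbation
# estimate are all illustrative assumptions.
import copy

import torch
import torch.nn as nn


class MaskedLinear(nn.Module):
    """Linear layer whose weights are gated by an explicit, differentiable mask."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) / in_features ** 0.5)
        self.mask = nn.Parameter(torch.ones(out_features, in_features))

    def forward(self, x):
        return x @ (self.weight * self.mask).t()


def ntk_trace_surrogate(model, x, eps=1e-2):
    """Finite-difference surrogate for the fixed-weight NTK trace:
    ||f(x; theta + delta) - f(x; theta)||^2 / eps^2 with delta ~ N(0, eps^2 I)."""
    perturbed = copy.deepcopy(model)
    for orig, pert in zip(model.modules(), perturbed.modules()):
        if isinstance(orig, MaskedLinear):
            with torch.no_grad():
                pert.weight.add_(eps * torch.randn_like(pert.weight))
            pert.mask = orig.mask  # share masks so gradients reach the original model
    return ((perturbed(x) - model(x)) ** 2).sum() / eps ** 2


def prune_by_ntk_score(model, sparsity=0.9, rounds=5, batch=256, in_dim=32):
    """Multi-shot pruning: score mask entries by |d surrogate / d mask| and zero the lowest."""
    masks = [m.mask for m in model.modules() if isinstance(m, MaskedLinear)]
    total = sum(m.numel() for m in masks)
    for r in range(1, rounds + 1):
        for m in masks:
            m.grad = None
        x = torch.randn(batch, in_dim)  # data-agnostic: random inputs, no training data
        ntk_trace_surrogate(model, x).backward()
        scores = torch.cat([(m.grad * m).abs().flatten().detach() for m in masks])
        density = (1.0 - sparsity) ** (r / rounds)  # exponential schedule (assumption)
        k = total - int(total * density)  # number of mask entries to zero this round
        if k == 0:
            continue
        threshold = torch.kthvalue(scores, k).values
        with torch.no_grad():
            for m in masks:
                m.masked_fill_((m.grad * m).abs() <= threshold, 0.0)


if __name__ == "__main__":
    net = nn.Sequential(MaskedLinear(32, 64), nn.ReLU(), MaskedLinear(64, 10))
    prune_by_ntk_score(net, sparsity=0.9)
    kept = sum(int(m.mask.sum().item()) for m in net.modules() if isinstance(m, MaskedLinear))
    print(f"weights kept: {kept} / {32 * 64 + 64 * 10}")
```

The default `batch=256` mirrors the pruning batch size quoted in the Experiment Setup row for the CIFAR-scale experiments; the exponential density schedule across rounds is a common choice in iterative foresight pruning and is likewise an assumption here rather than a detail taken from the paper.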