NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization

Authors: Gen Li, Lu Yin, Jie Ji, Wei Niu, Minghai Qin, Bin Ren, Linke Guo, Shiwei Liu, Xiaolong Ma

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Systematical evaluations on training speed and system overhead are conducted on mobile devices, where the proposed NeurRev framework consistently outperforms representative state-of-the-arts. We repeat software training experiments for 3 times and report the mean and standard deviation of the accuracy. (Section 3: Experimental Results)
Researcher Affiliation | Academia | 1 Clemson University; 2 Eindhoven University of Technology; 3 University of Georgia; 4 William & Mary; 5 University of Aberdeen; 6 University of Oxford
Pseudocode | Yes | Algorithm 1: NeurRev for DST; Algorithm 2: Search and Awake
Open Source Code | Yes | Code available in https://github.com/coulsonlee/NeurRev-ICLR2024.
Open Datasets | Yes | We implement our framework based on ResNet-32 (Wang et al., 2020) and VGG-19 (Simonyan & Zisserman, 2014), training and testing on CIFAR-10/100 (Krizhevsky, 2009). In order to show that NeurRev also applies to large datasets, we train and test our framework on ImageNet (Russakovsky et al., 2015) based on ResNet-34/50 (He et al., 2016).
Dataset Splits | No | The paper mentions training and testing datasets (CIFAR-10/100, ImageNet) and provides detailed hyperparameters in Appendix A (Table A.1), but it does not explicitly specify a distinct validation dataset split (e.g., percentage or sample count) used for hyperparameter tuning or early stopping.
Hardware Specification | Yes | The training speed results are obtained using a Samsung Galaxy S21 with Snapdragon 888 chipset.
Software Dependencies | No | The paper mentions using 'compiler code generation' to convert DNN computation graphs into 'static code (e.g., OpenCL or C++)', but it does not list specific software dependencies (e.g., deep learning frameworks like PyTorch or TensorFlow, or other libraries) with version numbers used for the experiments.
Experiment Setup | Yes | We follow the setting of GraSP (Wang et al., 2020) to set our training epochs equal to 160 for CIFAR-10/100. For CIFAR-10/100, we set a batch size of 64 and the initial learning rate to 0.1. The detailed experiment setting and hyper-parameter selections can be found in Appendix A. Table A.1: Hyperparameter settings (listing specific values for training epochs, batch size, learning rate, momentum, etc.).
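As a concrete illustration of the quoted experiment setup (160 training epochs, batch size 64, initial learning rate 0.1 for CIFAR-10/100), the sketch below shows how such a run might be configured in PyTorch. This is a minimal sketch under stated assumptions, not the authors' released code: torchvision's ResNet-18 stands in for the paper's CIFAR-style ResNet-32, and the momentum, weight decay, and learning-rate schedule are placeholders whose authoritative values are in Table A.1 of the paper and in the linked repository.

```python
# Minimal sketch of the quoted CIFAR-10 setup: 160 epochs, batch size 64, initial LR 0.1.
# Assumptions (not taken from the paper): SGD momentum 0.9, weight decay 5e-4, a cosine
# learning-rate schedule, standard CIFAR-10 augmentation, and torchvision's ResNet-18 as a
# stand-in for the paper's CIFAR-style ResNet-32 (which torchvision does not provide).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T


def main() -> None:
    transform = T.Compose([
        T.RandomCrop(32, padding=4),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
        T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
    ])
    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=transform
    )
    train_loader = torch.utils.data.DataLoader(
        train_set, batch_size=64, shuffle=True, num_workers=4
    )

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Stand-in model; the paper's experiments use ResNet-32 / VGG-19 with sparse weights.
    model = torchvision.models.resnet18(num_classes=10).to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=160)

    for epoch in range(160):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
        print(f"epoch {epoch + 1}/160 done, last batch loss {loss.item():.4f}")


if __name__ == "__main__":
    main()
```

A step-decay schedule (e.g., dividing the learning rate by 10 at fixed epochs, as is common in GraSP-style setups) would be an equally plausible reading of the setup; when reproducing the paper's numbers, the assumed values above should be replaced with those in Table A.1 and the repository. The sketch also omits the sparsity masks and the Search-and-Awake step of Algorithms 1 and 2, which are specific to NeurRev.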