NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization
Authors: Gen Li, Lu Yin, Jie Ji, Wei Niu, Minghai Qin, Bin Ren, Linke Guo, Shiwei Liu, Xiaolong Ma
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Systematical evaluations on training speed and system overhead are conducted on mobile devices, where the proposed NeurRev framework consistently outperforms representative state-of-the-arts. We repeat software training experiments for 3 times and report the mean and standard deviation of the accuracy. 3 EXPERIMENTAL RESULTS |
| Researcher Affiliation | Academia | 1 Clemson University 2 Eindhoven University of Technology 3 University of Georgia 4 William & Mary 5 University of Aberdeen 6 University of Oxford |
| Pseudocode | Yes | Algorithm 1: NeurRev for DST; Algorithm 2: Search and Awake |
| Open Source Code | Yes | Code available at https://github.com/coulsonlee/NeurRev-ICLR2024. |
| Open Datasets | Yes | We implement our framework based on ResNet-32 (Wang et al., 2020) and VGG-19 (Simonyan & Zisserman, 2014), training and testing on CIFAR-10/100 (Krizhevsky, 2009). In order to show that NeurRev also applies to large datasets, we train and test our framework on ImageNet (Russakovsky et al., 2015) based on ResNet-34/50 (He et al., 2016). |
| Dataset Splits | No | The paper mentions training and testing datasets (CIFAR-10/100, ImageNet) and provides detailed hyperparameters in Appendix A (Table A.1), but it does not explicitly specify a distinct validation dataset split (e.g., percentage or sample count) used for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | The training speed results are obtained using a Samsung Galaxy S21 with Snapdragon 888 chipset. |
| Software Dependencies | No | The paper mentions using 'compiler code generation' to convert DNN computation graphs into 'static code (e.g., OpenCL or C++)', but it does not list specific software dependencies (e.g., deep learning frameworks like PyTorch or TensorFlow, or other libraries) with version numbers used for the experiments. |
| Experiment Setup | Yes | We follow the setting of GraSP (Wang et al., 2020) to set our training epochs equal to 160 for CIFAR-10/100. For CIFAR-10/100, we set a batch size of 64 and the initial learning rate to 0.1. The detailed experiment setting and hyper-parameter selections can be found in Appendix A. Table A.1: Hyperparameter settings. (listing specific values for training epochs, batch size, learning rate, momentum, etc.) |
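
As a rough illustration of the "Experiment Setup" row above, the sketch below reconstructs the quoted CIFAR-10 hyperparameters (160 epochs, batch size 64, initial learning rate 0.1) as a minimal PyTorch training loop. This is an assumed reconstruction, not the authors' code: the framework choice, the momentum and weight-decay values, the absence of a learning-rate schedule, and the torchvision `resnet34` placeholder (the paper uses a CIFAR-style ResNet-32 and VGG-19) are all assumptions; Appendix A (Table A.1) of the paper is the authoritative source.

```python
# Hedged sketch of the quoted CIFAR-10 training setup (not the authors' implementation).
# Assumptions: PyTorch/torchvision, momentum 0.9, weight decay 5e-4, no LR schedule,
# and a torchvision ResNet as a stand-in for the paper's ResNet-32.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64,   # batch size quoted in the paper
                                           shuffle=True, num_workers=2)

model = torchvision.models.resnet34(num_classes=10)  # placeholder; paper uses ResNet-32 / VGG-19
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,        # initial LR quoted in the paper
                            momentum=0.9, weight_decay=5e-4)   # assumed values (see Table A.1)

for epoch in range(160):  # 160 epochs, following the GraSP schedule as quoted
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```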