Synaptic Strength For Convolutional Neural Network
Authors: Chen Lin, Zhao Zhong, Wei Wu, Junjie Yan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results show the effectiveness of our approach. On CIFAR-10, we prune connections for various CNN models by up to 96%, which results in significant size reduction and computation saving. Further evaluation on ImageNet demonstrates that synaptic pruning is able to discover efficient models that are competitive with state-of-the-art compact CNNs such as MobileNet-V2 and NASNet-Mobile. |
| Researcher Affiliation | Collaboration | Chen Lin (SenseTime Research, linchen@sensetime.com); Zhao Zhong (NLPR, CASIA & University of Chinese Academy of Sciences, zhao.zhong@nlpr.ia.ac.cn); Wei Wu (SenseTime Research, wuwei@sensetime.com); Junjie Yan (SenseTime Research, yanjunjie@sensetime.com) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | In order to evaluate the effectiveness of synapse pruning, we experiment with CIFAR-10 and ImageNet. |
| Dataset Splits | Yes | CIFAR-10 dataset contains 50,000 train examples and 10,000 test examples. ImageNet dataset is a large-scale image recognition benchmark which contains 1.2 million images for training and 50,000 for validation. (A hedged data-loading sketch follows this table.) |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | All the networks are optimized with Stochastic Gradient Descent (SGD) with momentum 0.9 and weight decay 10^-4. For CIFAR-10 models, we train them with batch size 128 for 240 epochs in total. The initial learning rate is set to 0.1 and divided by 10 at the beginning of epochs 120 and 180. For ImageNet models, we train them with batch size 256 for 100 epochs in total. The initial learning rate is set to 0.1 and divided by 10 at the beginning of epochs 30, 60 and 90. For CIFAR-10 models, we pick λ equal to 10^-4 for VGGNet, and 10^-5 and 5 * 10^-6 for ResNet-18 and DenseNet-40 respectively. For ImageNet models, the sparsity constraint rate is set to 10^-6. |
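
The Dataset Splits row quotes the standard CIFAR-10 partition (50,000 train / 10,000 test). The following PyTorch/torchvision sketch loads that split; the normalization constants and worker count are common defaults assumed here, not values reported in the paper, while the batch size of 128 matches the Experiment Setup row.

```python
# Minimal sketch of the CIFAR-10 split described above: 50,000 train / 10,000 test.
# Normalization constants and num_workers are assumptions, not values from the paper.
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=4)

print(len(train_set), len(test_set))  # 50000 10000
```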
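
The CIFAR-10 training configuration in the Experiment Setup row can also be expressed as a short PyTorch sketch. The optimizer, schedule, and λ values follow the quoted text; how the sparsity constraint enters the loss is an assumption (an L1 penalty on the per-connection Synaptic Strength parameters), and `synaptic_strengths` is a hypothetical list of those parameters, not an API from the paper's code.

```python
# Sketch of the CIFAR-10 schedule quoted above: SGD with momentum 0.9, weight decay 1e-4,
# 240 epochs, initial LR 0.1 divided by 10 at epochs 120 and 180.
# The L1 form of the sparsity term is an assumption; `synaptic_strengths` is hypothetical.
import torch

def build_optimizer_and_scheduler(model):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[120, 180], gamma=0.1)
    return optimizer, scheduler

def loss_with_sparsity(criterion, outputs, targets, synaptic_strengths, lam=1e-4):
    # lam = 1e-4 for VGGNet on CIFAR-10; 1e-5 / 5e-6 for ResNet-18 / DenseNet-40;
    # 1e-6 for ImageNet models, per the Experiment Setup row.
    task_loss = criterion(outputs, targets)
    sparsity = sum(s.abs().sum() for s in synaptic_strengths)
    return task_loss + lam * sparsity
```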