DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks
Authors: Ren Ao, Zhang Tao, Wang Yuhao, Lin Sheng, Dong Peiyan, Chen Yen-kuang, Xie Yuan, Wang Yanzhi
Venue: AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results show that DARB can achieve 13× to 25× pruning ratio, which are 2.8× to 4.3× improvements than the state-of-the-art counterparts on multiple neural network models and tasks. |
| Researcher Affiliation | Collaboration | Alibaba DAMO Academy Northeastern University {t.zhang, yuhao.w, yk.chen, y.xie}@alibaba-inc.com, {lin.sheng, dong.pe, ren.ao}@husky.neu.edu, yanz.wang@northeastern.edu |
| Pseudocode | No | The paper describes the methods textually and with diagrams but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or a link indicating that the source code for the methodology is open-source or publicly available. |
| Open Datasets | Yes | For this task, a large size two-layer LSTM (Zaremba, Sutskever, and Vinyals 2014) is built to perform the word-level prediction for Penn Tree Bank (PTB) dataset (Marcus, Marcinkiewicz, and Santorini 1993)... This task is evaluated with TIMIT (Garofolo et al. 1990)... we train an AlexNet (Krizhevsky, Sutskever, and Hinton 2012b) with ImageNet (Deng et al. 2009)... We also evaluate DARB in the convolutional layers of VGG-16 (Simonyan and Zisserman 2014) on CIFAR10 (Krizhevsky, Hinton, and others 2009). |
| Dataset Splits | Yes | The training data, validation data, and test data of PTB dataset has 929k, 73k, and 82k words, respectively. |
| Hardware Specification | No | The paper mentions synthesis in 'CMOS 40nm process' for decoders but does not provide specific hardware details (like GPU or CPU models) used for training and evaluating the neural network models. |
| Software Dependencies | No | The paper mentions using the 'ADMM-NN pruning framework (Ren et al. 2019)' but does not specify version numbers for any software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | The model is trained with 20 batches and 35 unrolling steps, which has the same configurations as the prior arts (Zaremba, Sutskever, and Vinyals 2014; Wen et al. 2017). The dropout configuration for DARB is (0.35, 0.75) in the ADMM regularization step and (0.35, 0.7) in the retrain step, where the former in the parentheses is the dropout for LSTM layers, and the latter is for other layers. (A configuration sketch based on these settings follows the table.) |
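
The quoted experiment-setup details can be collected into a single configuration object. The sketch below is not the authors' code; the dict name `darb_ptb_config`, its key names, and the reading of "20 batches" as a batch size of 20 are illustrative assumptions layered on the values quoted above.

```python
# Minimal sketch (assumed names, not the authors' code): the LSTM/PTB training
# hyperparameters quoted in the "Experiment Setup" row, gathered into one config.
darb_ptb_config = {
    "model": "2-layer LSTM (large)",            # Zaremba, Sutskever, and Vinyals 2014 setup
    "dataset": "Penn Tree Bank (word-level)",
    "batch_size": 20,                           # quoted as "trained with 20 batches" (assumed to mean batch size)
    "unroll_steps": 35,                         # BPTT unrolling length
    # Dropout is reported as (LSTM-layer dropout, other-layer dropout)
    "dropout_admm_step": {"lstm": 0.35, "other": 0.75},
    "dropout_retrain_step": {"lstm": 0.35, "other": 0.7},
}

if __name__ == "__main__":
    # Print the configuration for a quick sanity check.
    for key, value in darb_ptb_config.items():
        print(f"{key}: {value}")
```

Keeping the ADMM-regularization and retrain dropout settings as separate entries mirrors the two-step pruning flow described in the paper (ADMM regularization followed by retraining).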