DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices
Authors: Dawei Li, Xiaolong Wang, Deguang Kong
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As observed in the experiment, DeepRebirth achieves more than 3x speed-up and 2.5x run-time memory saving on GoogLeNet with only a 0.4% drop in top-5 accuracy on ImageNet. Furthermore, by combining with other model compression techniques, DeepRebirth offers an average of 106.3ms inference time on the CPU of Samsung Galaxy S5 with 86.5% top-5 accuracy, 14% faster than SqueezeNet which only has a top-5 accuracy of 80.5%. |
| Researcher Affiliation | Industry | Dawei Li, Xiaolong Wang, Deguang Kong (Samsung Research America, Mountain View, CA); {xiaolong.w, dawei.l}@samsung.com, {doogkong}@gmail.com |
| Pseudocode | No | The paper does not include any pseudocode or clearly labeled algorithm blocks. It describes methods textually and with mathematical formulas. |
| Open Source Code | No | The paper does not provide any statement about making its source code publicly available. |
| Open Datasets | Yes | We evaluate the accuracy loss in contrast to original ones after performing the accelerated models. The accuracy changing along with the optimization steps conducted on ImageNet ILSVRC-2012 validation dataset are listed in Table 3. (A hedged sketch of the top-5 accuracy metric used in this evaluation appears after the table.) |
| Dataset Splits | No | The paper mentions using the "ImageNet ILSVRC-2012 validation dataset" for accuracy evaluation, which is typically treated as the test set when reporting final performance. It specifies training parameters such as a "base learning rate for the regenerated layer as 0.01 (the rest layers are 0.001)", a "batch size of 32", and "40,000 as the step size", but does not explicitly detail training/validation/test splits as percentages or sample counts. |
| Hardware Specification | Yes | We evaluate the speed-up on other popular processors besides Galaxy S5, including (1) Moto E: a low-end mobile ARM CPU, (2) Samsung Galaxy S6: a high-end mobile ARM CPU, (3) MacBook Pro: an Intel x86 CPU, and (4) Titan X: a powerful server GPU. |
| Software Dependencies | No | Our implementation is based on Caffe (Jia et al. 2014) deep learning framework, and we compile it using Android NDK for mobile evaluation. OpenBLAS (Xianyi, Qian, and Chothia 2014) is used for efficient linear algebra calculations. The paper names software components but does not provide version numbers for them (e.g., Caffe, Android NDK, OpenBLAS); a sketch of recording such versions appears after the table. |
| Experiment Setup | Yes | During the whole optimization procedure of model training, we set the base learning rate for the regenerated layer as 0.01 (the rest layers are 0.001). We apply stochastic gradient descent training method (Bottou 2012) to learn the parameters with a batch size of 32. During our training phase, we set 40,000 as the step size together with 0.1 for gamma value and 0.9 for momentum parameter. (A minimal sketch of this schedule appears after the table.) |
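The evaluation rows above report top-5 accuracy on the ImageNet ILSVRC-2012 validation set. For reference, the sketch below shows how that metric is computed; it is our own minimal PyTorch illustration, not the authors' Caffe evaluation code, and the tensors are dummy placeholders.

```python
import torch

def top5_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of samples whose true label is among the 5 highest logits."""
    top5 = logits.topk(5, dim=1).indices             # (N, 5) highest-scoring classes
    hits = (top5 == labels.unsqueeze(1)).any(dim=1)  # (N,) True if label is in top-5
    return hits.float().mean().item()

# Dummy usage: 8 samples over the 1,000 ILSVRC classes.
logits = torch.randn(8, 1000)
labels = torch.randint(0, 1000, (8,))
print(f"top-5 accuracy: {top5_accuracy(logits, labels):.3f}")
```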
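The Software Dependencies row flags that Caffe, Android NDK, and OpenBLAS are named without version numbers, so a reproduction would need to pin them itself. The snippet below is a hypothetical example of recording whatever versions are locally installed; nothing in it comes from the paper.

```python
import json
import platform

versions = {
    "python": platform.python_version(),
    "os": platform.platform(),
}

# Recent pycaffe builds expose a version string; fall back gracefully,
# since the paper pins no version. (Hypothetical environment check.)
try:
    import caffe
    versions["caffe"] = getattr(caffe, "__version__", "unknown")
except ImportError:
    versions["caffe"] = "not installed"

print(json.dumps(versions, indent=2))
```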
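The quoted experiment setup corresponds to Caffe's step learning-rate policy: SGD with momentum 0.9, batch size 32, per-layer base learning rates of 0.01 (regenerated layer) and 0.001 (all other layers), and a 10x decay every 40,000 iterations. The sketch below mirrors that schedule in PyTorch under stated assumptions; the two-layer stand-in model and dummy loss are ours, not the paper's GoogLeNet setup.

```python
import torch
import torch.nn as nn

# Stand-in model: "regenerated" plays the role of the merged slim layer,
# "rest" the untouched layers. (Hypothetical shapes; the paper uses GoogLeNet.)
model = nn.Sequential()
model.add_module("regenerated", nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2))
model.add_module("rest", nn.Conv2d(16, 16, kernel_size=3, padding=1))

optimizer = torch.optim.SGD(
    [
        {"params": model.regenerated.parameters(), "lr": 0.01},  # regenerated layer
        {"params": model.rest.parameters()},                     # inherits the 0.001 default
    ],
    lr=0.001,      # base rate for all other layers
    momentum=0.9,  # momentum parameter from the paper
)
# Step decay: multiply every group's learning rate by gamma=0.1 each 40,000 iterations.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40_000, gamma=0.1)

for step in range(100):                 # shortened loop; real training runs much longer
    batch = torch.randn(32, 3, 32, 32)  # batch size 32; small dummy images
    loss = model(batch).mean()          # placeholder loss for the sketch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```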