Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
Authors: Jianghao Shen, Yue Wang, Pengfei Xu, Yonggan Fu, Zhangyang Wang, Yingyan Lin
AAAI 2020, pp. 5700-5708
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate the superior tradeoff between computational cost and model expressive power (accuracy) achieved by DFS. |
| Researcher Affiliation | Academia | Jianghao Shen (1,2), Yue Wang (1), Pengfei Xu (1), Yonggan Fu (1), Zhangyang Wang (2), Yingyan Lin (1). Affiliations: (1) Rice University, (2) Texas A&M University. Contact: {nie, atlaswang}@tamu.edu, {yw68, px5, yf22, yingyan.lin}@rice.edu |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our source code and supplementary material are available at https://github.com/Torment123/DFS. |
| Open Datasets | Yes | Models and Datasets: We evaluate the DFS method using ResNet38 and ResNet74 as the backbone models on two datasets: CIFAR-10 and CIFAR-100. |
| Dataset Splits | No | The paper uses CIFAR-10 and CIFAR-100 datasets but does not explicitly state the training, validation, and test dataset splits with percentages, sample counts, or specific split methodologies. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Training Details: The training of DFS follows the two-step procedure described in Section 3. For the first step, we set the initial learning rate to 0.1 and train the gating network for a total of 64000 iterations; the learning rate is reduced by a factor of 10 after the 32000-th iteration, and by a further factor of 10 after the 48000-th iteration. The specified computation budget is set to 100%. The hyperparameters, including the momentum, weight decay factor, and batch size, are set to 0.9, 1e-4, and 128, respectively, and the absolute value of α in Equation (2) is set to 5e-6. |
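For reproduction purposes, the reported learning-rate schedule (initial rate 0.1, divided by 10 after iterations 32000 and 48000, over 64000 total iterations) can be sketched as a simple step-decay function. This is a minimal illustration of the schedule as stated in the table above, not the authors' implementation; the function name and signature are our own.

```python
def learning_rate(iteration, base_lr=0.1, milestones=(32000, 48000), gamma=0.1):
    """Step-decay schedule from the reported setup: the rate is divided
    by 10 (gamma = 0.1) once each milestone iteration has been reached."""
    lr = base_lr
    for m in milestones:
        if iteration >= m:
            lr *= gamma
    return lr

# Reported schedule over the 64000-iteration first training step:
print(learning_rate(0))      # 0.1 for iterations 0..31999
print(learning_rate(32000))  # 0.01 for iterations 32000..47999
print(learning_rate(48000))  # 0.001 for iterations 48000..63999
```

The remaining reported hyperparameters (momentum 0.9, weight decay 1e-4, batch size 128, and |α| = 5e-6 in Equation (2)) would be passed to the optimizer and loss alongside this schedule.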