Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-Offs by Selective Execution
Authors: Lanlan Liu, Jia Deng
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With extensive experiments of various D2NN architectures on image classification tasks, we demonstrate that D2NNs are general and flexible, and can effectively optimize accuracy-efficiency trade-offs. We perform extensive experiments to validate our D2NNs algorithms. We evaluate various D2NN architectures on several tasks. |
| Researcher Affiliation | Academia | Lanlan Liu, Jia Deng University of Michigan, Ann Arbor 2260 Hayward Street Ann Arbor, Michigan, 48109 |
| Pseudocode | No | The paper describes learning methods and inference steps but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We implement the D2NN framework in Torch.' but does not provide any explicit statement or link for public code release. |
| Open Datasets | Yes | We use the Labeled Faces in the Wild (Huang et al. 2007; Learned-Miller 2014) dataset. We use ILSVRC-10, a subset of the ILSVRC-65 (Deng et al. 2012). |
| Dataset Splits | Yes | We hold out 11k images for validation and 22k for testing. Each class has 500 training images, 50 validation images, and 150 test images. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'We implement the D2NN framework in Torch.' but does not provide version numbers for Torch or any other software dependencies. |
| Experiment Setup | Yes | During training we define the Q-learning reward as a linear combination of accuracy A and efficiency E (negative cost): r = λA + (1 − λ)E, where λ ∈ [0, 1]. During training we also perform ϵ-greedy exploration: instead of always choosing the action with the best Q value, we choose a random action with probability ϵ. The hyperparameter ϵ is initialized to 1 and decreases over time. (A minimal sketch of this reward and exploration scheme appears after the table.) |
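
The sketch below illustrates the reported experiment setup: the reward r = λA + (1 − λ)E and ϵ-greedy action selection. It is not the authors' implementation (which was written in Torch); the function names, the random-tie-free argmax, and the decay schedule are assumptions for illustration only.

```python
import random

# Hedged sketch of the reward described in the paper:
# r = lam * A + (1 - lam) * E, with lam in [0, 1] and E a negative cost.
def reward(accuracy, efficiency, lam):
    return lam * accuracy + (1 - lam) * efficiency

# Hedged sketch of epsilon-greedy exploration: with probability eps pick a
# random action, otherwise pick the action with the highest Q value.
# The paper initializes eps to 1 and decreases it over time; the exact
# decay schedule is not specified, so any schedule here is an assumption.
def select_action(q_values, eps):
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```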