Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
Authors: Zhaoyang Zhang, Wenqi Shao, Jinwei Gu, Xiaogang Wang, Ping Luo
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively compare DDQ with existing state-of-the-art methods and conduct multiple ablation studies on ImageNet (Russakovsky et al., 2015) and the CIFAR dataset (Krizhevsky et al., 2009) |
| Researcher Affiliation | Collaboration | 1The Chinese University of Hong Kong 2SenseBrain, Ltd 3Shanghai AI Lab 4Hong Kong University. |
| Pseudocode | Yes | Algorithm 1 Training procedure of DDQ |
| Open Source Code | No | The codes will be released. |
| Open Datasets | Yes | We extensively compare DDQ with existing state-of-the-art methods and conduct multiple ablation studies on ImageNet (Russakovsky et al., 2015) and the CIFAR dataset (Krizhevsky et al., 2009) |
| Dataset Splits | Yes | The reported validation accuracy is simulated with bq = 8, if not stated otherwise. |
| Hardware Specification | No | The paper mentions 'mobile DSPs' in Table 4 and 'ARM' as a target hardware, but does not specify the exact models or detailed specifications of the hardware used for running experiments (e.g., specific GPU/CPU models, memory). |
| Software Dependencies | No | Training DDQ can be simply implemented in existing platforms such as PyTorch and TensorFlow. |
| Experiment Setup | Yes | More importantly, DDQ is trained for 30 epochs, reducing the training time compared to most of the reported approaches that trained much longer (i.e. 90 or 120 epochs). For memory constraints, α is set to 0.02 empirically. We use a learning rate of 1e-8 for {ĝ_i}_{i=1}^b, ensuring sufficient training when the precision decreases. (An illustrative sketch of such a setup follows the table.) |
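
Since the paper states that DDQ can be implemented in existing platforms such as PyTorch, the sketch below illustrates what a differentiable mixed-precision quantizer and the reported two-rate optimizer setup could look like. This is not the authors' released DDQ code (no code was released); the class `SoftMixedPrecisionQuantizer`, the candidate bit-widths, the straight-through estimator, and all hyperparameters other than the quoted 1e-8 gate learning rate and 30-epoch budget are illustrative assumptions.

```python
# Hypothetical sketch, NOT the authors' DDQ implementation: a generic
# differentiable mixed-precision quantizer that learns a softmax mixture
# over candidate bit-widths with a straight-through estimator.
import torch
import torch.nn as nn


class SoftMixedPrecisionQuantizer(nn.Module):
    """Quantizes a tensor as a softmax-weighted mixture over bit-widths."""

    def __init__(self, bit_candidates=(2, 4, 8)):
        super().__init__()
        self.bit_candidates = bit_candidates
        # One learnable gate per candidate bit-width (stand-in for the
        # paper's gate variables; names here are hypothetical).
        self.gates = nn.Parameter(torch.zeros(len(bit_candidates)))

    @staticmethod
    def _uniform_quantize(x, bits):
        # Uniform quantization with a straight-through estimator:
        # forward uses the rounded values, backward passes gradients
        # through x unchanged.
        levels = 2 ** bits - 1
        scale = x.detach().abs().max().clamp(min=1e-8) / levels
        q = torch.round(x / scale) * scale
        return x + (q - x).detach()

    def forward(self, x):
        weights = torch.softmax(self.gates, dim=0)
        return sum(w * self._uniform_quantize(x, b)
                   for w, b in zip(weights, self.bit_candidates))


# Toy training loop: separate learning rates for network weights and
# quantizer gates. Only the 1e-8 gate learning rate and the 30-epoch
# budget come from the paper's text; everything else is illustrative.
model = nn.Linear(16, 4)
quantizer = SoftMixedPrecisionQuantizer()
optimizer = torch.optim.SGD([
    {"params": model.parameters(), "lr": 1e-2},
    {"params": quantizer.parameters(), "lr": 1e-8},
])

x = torch.randn(8, 16)
target = torch.randn(8, 4)
for _ in range(30):  # 30 "epochs" on a single toy batch
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(quantizer(x)), target)
    loss.backward()
    optimizer.step()
```

The memory-constraint weight α = 0.02 mentioned in the setup would enter as an additional penalty term on the expected bit-width; it is omitted from this sketch for brevity.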