Mixed Precision DNNs: All you need is a good parametrization

Authors: Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, Akira Nakamura

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We confirm our findings with experiments on CIFAR-10 and ImageNet and we obtain mixed precision DNNs with learned quantization parameters, achieving state-of-the-art performance."
Researcher Affiliation | Industry | "Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso García, Stephen Tiedemann, Thomas Kemp (Sony Europe B.V., Germany, firstname.lastname@sony.com); Akira Nakamura (Sony Corporate, Japan, akira.b.nakamura@sony.com)"
Pseudocode | Yes | "The following code gives our differentiable quantizer implementation in NNabla (Sony). The source code for reproducing our results will be published after the review process has been finished." (see the quantizer sketch below the table)
Open Source Code | No | "The source code for reproducing our results will be published after the review process has been finished."
Open Datasets | Yes | "We confirm our findings with experiments on CIFAR-10 and ImageNet."
Dataset Splits | Yes | "Fig. 4 shows the evolution of the training and validation error during training for the case of uniform quantization. The plots for power-of-two quantization can be found in the appendix (Fig. 10). We initialize this network from random parameters or from a pre-trained float network."
Hardware Specification | Yes | "Each epoch takes about 2.5 min on a single GTX 1080 Ti."
Software Dependencies | No | "The following code gives our differentiable quantizer implementation in NNabla (Sony)."
Experiment Setup | Yes | "The quantized DNNs are trained for 160 epochs, using SGD with momentum 0.9 and a learning rate schedule starting with 0.01 and reducing it by a factor of 10 after 80 and 120 epochs, respectively. We use random flips and crops for data augmentation." (see the training-schedule sketch below the table)
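
The Pseudocode row refers to the authors' differentiable quantizer implemented in NNabla; that code is not reproduced in this report. The following is a minimal NumPy sketch of a symmetric uniform quantizer parametrized by step size and bitwidth, purely for illustration (the function name and example values are ours, not the paper's NNabla code). In the paper's training setting, the rounding and clipping would be bypassed with a straight-through estimator so that the quantization parameters remain learnable.

import numpy as np

def uniform_quantize(x, step_size, bitwidth):
    # Largest magnitude representable with a symmetric uniform grid
    # defined by the given step size and bitwidth.
    q_max = step_size * (2.0 ** (bitwidth - 1) - 1)
    # Clip to the representable range, then snap to the nearest level.
    # During training, a straight-through estimator would pass the
    # gradient through this round/clip so step_size stays trainable.
    x_clipped = np.clip(x, -q_max, q_max)
    return step_size * np.round(x_clipped / step_size)

# Example: quantize a small tensor to 4 bits with step size 0.1.
w = np.array([-1.3, -0.07, 0.02, 0.55, 2.4])
print(uniform_quantize(w, step_size=0.1, bitwidth=4))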
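
For the Experiment Setup row, the quoted hyperparameters amount to a simple step schedule. The sketch below restates them in plain Python as an assumption-labeled illustration (the helper name and structure are ours, not the authors' training code).

def learning_rate(epoch, base_lr=0.01, milestones=(80, 120), gamma=0.1):
    # Start at 0.01 and divide by 10 after epochs 80 and 120,
    # matching the quoted 160-epoch SGD schedule (momentum 0.9).
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Epochs 0-79: 0.01, epochs 80-119: 0.001, epochs 120-159: 0.0001.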