Mixed Precision Training of Convolutional Neural Networks using Integer Operations
Authors: Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov
ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. ... and these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. |
| Researcher Affiliation | Industry | Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar {dipankar.das,naveen.k.mellempudi,dheevatsa.mudigere,dhiraj.d.kalamkar}@intel.com, Parallel Computing Lab, Intel Labs, India; Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Parallel Computing Lab, Intel Labs, India; Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Parallel Computing Lab, Intel Labs, SC; Jesus Corbal, Product Architecture Group, Intel, OR; Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov, Software Services Group, Intel, OR |
| Pseudocode | Yes | Algorithm 1: Semantics of the QVNNI16 Instruction ... Algorithm 2: Example Forward Propagation Loop (a hedged scalar sketch of the QVNNI16 semantics appears after this table) |
| Open Source Code | No | For the mixed precision DFP16 experiments we use a private fork of this branch, where we have added DFP16 data-type support. (A sketch of the shared-exponent DFP16 representation appears after this table.) |
| Open Datasets | Yes | train state-of-the-art visual understanding neural networks on ImageNet-1K dataset... Russakovsky et al. (2015) on the ImageNet-1K dataset Deng et al. (2009) |
| Dataset Splits | Yes | We trained several CNNs for the ImageNet-1K classification task using mixed precision DFP16... We use exactly the same batch-size and hyper-parameter configuration for both the baseline FP32 and DFP16 training runs (Table 1). |
| Hardware Specification | Yes | Both baseline and mixed precision DFP16 experiments are run on the newly introduced Intel® Xeon Phi™ Knights-Mill hardware... Intel® Xeon Phi™ Processor 7295 (codename Knights-Mill) |
| Software Dependencies | No | The paper mentions the 'BVLC CAFFE framework' and 'Intel's MKL-DNN library' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We use exactly the same batch-size and hyper-parameter configuration for both the baseline FP32 and DFP16 training runs (Table 1). In both cases, the models are trained from scratch using synchronous SGD on multiple nodes. ... Table 1: Training configuration and ImageNet-1K classification accuracy, with columns: Models, Batch-size / Epochs, Baseline, Mixed precision DFP16... |
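As context for the Pseudocode row: the paper's Algorithm 1 specifies the semantics of the QVNNI16 instruction, a "quad" VNNI fused multiply-accumulate on Knights-Mill. Below is a minimal scalar reference model written against the publicly documented VP4DPWSSD-style behavior (pairs of int16 products accumulated into 16 int32 lanes, four fused steps per issue); the function and operand names are illustrative assumptions, not the paper's notation.

```c
#include <stdint.h>

/* Scalar sketch of QVNNI16 / VP4DPWSSD-style semantics (assumed layout):
 * one 16-lane int32 accumulator is updated from four int16 vector
 * sources and four int16 pairs loaded from memory. Products are formed
 * at 32-bit precision, so no intermediate result is truncated. */
void qvnni16_model(int32_t acc[16],          /* destination accumulator       */
                   const int16_t src[4][32], /* four 32-element int16 sources */
                   const int16_t mem[8])     /* four int16 pairs from memory  */
{
    for (int n = 0; n < 4; ++n) {            /* the four fused VNNI steps */
        const int32_t b0 = mem[2 * n];
        const int32_t b1 = mem[2 * n + 1];
        for (int i = 0; i < 16; ++i) {
            /* 16x16 -> 32-bit multiplies, summed into a 32-bit lane */
            acc[i] += (int32_t)src[n][2 * i]     * b0
                    + (int32_t)src[n][2 * i + 1] * b1;
        }
    }
}
```

The wide int32 accumulator is the point of the instruction: long chains of int16 multiplies can run inside a convolution's inner product before anything is rounded back to 16 bits.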
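The DFP16 format mentioned in the Open Source Code row is a dynamic fixed point representation: a tensor is stored as int16 mantissas that share a single exponent, chosen from the tensor's maximum magnitude. The sketch below shows one way such a quantizer can work; the `dfp16_tensor` type and function names are assumptions for illustration, not the paper's or MKL-DNN's actual API.

```c
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical DFP16 block: int16 mantissas with one shared exponent,
 * so value_i = mant[i] * 2^exp. */
typedef struct {
    int16_t *mant; /* allocated here; caller frees */
    int      exp;
    size_t   n;
} dfp16_tensor;

void dfp16_quantize(dfp16_tensor *t, const float *x, size_t n) {
    /* Pick the shared exponent from the largest magnitude in the block. */
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        const float a = fabsf(x[i]);
        if (a > amax) amax = a;
    }
    int e = 0;
    frexpf(amax > 0.0f ? amax : 1.0f, &e); /* amax = m * 2^e, m in [0.5, 1) */
    t->exp  = e - 15;                      /* 15 magnitude bits in an int16 */
    t->n    = n;
    t->mant = malloc(n * sizeof *t->mant);

    const float scale = ldexpf(1.0f, -t->exp);
    for (size_t i = 0; i < n; ++i) {
        long q = lrintf(x[i] * scale);
        if (q >  32767) q =  32767;        /* guard rounding at the range top */
        if (q < -32767) q = -32767;
        t->mant[i] = (int16_t)q;
    }
}

/* Recover an approximate float value from the shared-exponent form. */
float dfp16_value(const dfp16_tensor *t, size_t i) {
    return ldexpf((float)t->mant[i], t->exp);
}
```

Because the exponent is shared per tensor (or per block), the expensive inner loops stay pure int16/int32 arithmetic, which is what makes QVNNI16-style kernels like the sketch above applicable to training.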