AutoQ: Automated Kernel-Wise Neural Network Quantization
Authors: Qian Lou, Feng Guo, Minje Kim, Lantao Liu, Lei Jiang.
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate AutoQ, we selected several CNN models including ResNet-18, ResNet-50, SqueezeNet V1 (Iandola et al., 2016) and MobileNetV2 (Sandler et al., 2018). The CNN models are trained on ImageNet including 1.26M training images and tested on 50K test images spanning 1K categories of objects. We evaluated the inference performance, energy consumption and FPGA area of the CNN models quantized by AutoQ on a Xilinx Zynq-7020 embedded FPGA. |
| Researcher Affiliation | Academia | {louqian, fengguo, minje, lantao, jiang60}@iu.edu, Indiana University Bloomington |
| Pseudocode | No | The paper describes the workflow and mathematical formulation of AutoQ, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | No | The paper does not contain any statements about releasing code for its methodology, nor does it provide any links to a code repository. |
| Open Datasets | Yes | The CNN models are trained on ImageNet including 1.26M training images and tested on 50K test images spanning 1K categories of objects. |
| Dataset Splits | No | The paper mentions training and testing images from ImageNet, but it does not explicitly provide details about a separate validation set, its size, or how it was used in relation to the training and testing phases (e.g., '80/10/10 split'). |
| Hardware Specification | Yes | We evaluated the inference performance, energy consumption and FPGA area of the CNN models quantized by Auto Q on a Xilinx Zynq-7020 embedded FPGA. |
| Software Dependencies | No | The paper specifies training parameters such as learning rates, batch size, and the use of stochastic gradient descent (SGD). However, it does not list specific software libraries or frameworks (e.g., PyTorch, TensorFlow) with their version numbers, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | We use a fixed learning rate of 10^-4 for the actor network and 10^-3 for the critic network. AutoQ trains the networks with the batch size of 64 and the replay buffer size of 2000. AutoQ first explores 100 episodes with a constant noise, i.e., δ_a[L_i, K_j] = 0.5 for the LLC and δ_g[L_i] = 0.5 for the HLC, and then exploits 300 episodes with exponentially decayed noise. ... fine-tune the quantized model for ten epochs to recover the accuracy using stochastic gradient descent (SGD) with a fixed learning rate of 10^-3 and momentum of 0.9. ... goal_max is the maximum average QBN for a layer and we set it to 8. ... action_max is the maximum QBN for a kernel and we set it to 8. ... For resource-constrained applications, e.g., low-power drones, AutoQ sets ψ_acc = 1, ψ_l = 0, ψ_e = 0 and ψ_a = 0 ... For accuracy-guaranteed applications, e.g., fingerprint locks, AutoQ sets ψ_acc = 2, ψ_l < 1, ψ_e < 1 and ψ_a < 1 ... (these hyperparameters are collected in the sketches after the table) |
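
For readability, the search hyperparameters quoted in the Experiment Setup row can be collected in one place. The sketch below is a minimal, hedged Python rendering of those values; the names (`SearchConfig`, `noise_for_episode`) and the decay factor are illustrative assumptions, not from the paper, and only the numeric values come from the quoted text.

```python
# Hedged sketch: the AutoQ search hyperparameters quoted above, gathered into one
# configuration object. Only the numeric values are taken from the paper's text;
# all identifiers and the decay rate are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class SearchConfig:
    actor_lr: float = 1e-4          # fixed learning rate of the actor network
    critic_lr: float = 1e-3         # fixed learning rate of the critic network
    batch_size: int = 64            # batch size for training the actor/critic networks
    replay_buffer_size: int = 2000  # replay buffer size
    explore_episodes: int = 100     # constant-noise exploration phase
    exploit_episodes: int = 300     # exponentially decayed-noise exploitation phase
    init_noise_llc: float = 0.5     # δ_a[L_i, K_j] for the low-level controller (LLC)
    init_noise_hlc: float = 0.5     # δ_g[L_i] for the high-level controller (HLC)
    goal_max: int = 8               # maximum average QBN for a layer
    action_max: int = 8             # maximum QBN for a kernel


def noise_for_episode(cfg: SearchConfig, episode: int, decay: float = 0.99) -> float:
    """Constant noise while exploring, then exponential decay while exploiting.

    The paper only says the noise is 'exponentially decayed'; the decay factor
    used here is an assumed placeholder.
    """
    if episode < cfg.explore_episodes:
        return cfg.init_noise_llc
    return cfg.init_noise_llc * decay ** (episode - cfg.explore_episodes)
```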
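
The fine-tuning step quoted above (ten epochs of SGD with a fixed learning rate of 10^-3 and momentum 0.9) could look roughly like the following. The paper does not name a framework, so the use of PyTorch and the `finetune` helper are assumptions; the model and data loader are placeholders.

```python
# Hedged sketch of the post-quantization fine-tuning described above, assuming PyTorch.
import torch


def finetune(model: torch.nn.Module, train_loader, epochs: int = 10) -> torch.nn.Module:
    """Fine-tune a quantized model for a few epochs to recover accuracy."""
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```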