AutoQ: Automated Kernel-Wise Neural Network Quantization

Authors: Qian Lou, Feng Guo, Minje Kim, Lantao Liu, Lei Jiang.

ICLR 2020

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | To evaluate AutoQ, we selected several CNN models including ResNet-18, ResNet-50, SqueezeNetV1 (Iandola et al., 2016) and MobileNetV2 (Sandler et al., 2018). The CNN models are trained on ImageNet including 1.26M training images and tested on 50K test images spanning 1K categories of objects. We evaluated the inference performance, energy consumption and FPGA area of the CNN models quantized by AutoQ on a Xilinx Zynq-7020 embedded FPGA.

Researcher Affiliation | Academia | {louqian, fengguo, minje, lantao, jiang60}@iu.edu, Indiana University Bloomington

Pseudocode | No | The paper describes the workflow and mathematical formulations of AutoQ, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.

Open Source Code | No | The paper does not state that code for its methodology has been released, nor does it provide a link to a code repository.

Open Datasets | Yes | The CNN models are trained on ImageNet including 1.26M training images and tested on 50K test images spanning 1K categories of objects.

Dataset Splits | No | The paper mentions ImageNet training and test images, but it does not explicitly describe a separate validation set, its size, or how it was used relative to the training and testing phases (e.g., an '80/10/10 split').

Hardware Specification | Yes | We evaluated the inference performance, energy consumption and FPGA area of the CNN models quantized by AutoQ on a Xilinx Zynq-7020 embedded FPGA.

Software Dependencies | No | The paper specifies training parameters such as learning rates, batch size, and the use of stochastic gradient descent (SGD), but it does not list specific software libraries or frameworks (e.g., PyTorch, TensorFlow) with version numbers, which would be necessary for full reproducibility.

Experiment Setup | Yes | We use a fixed learning rate of 10^-4 for the actor network and 10^-3 for the critic network. AutoQ trains the networks with a batch size of 64 and a replay buffer size of 2000. AutoQ first explores 100 episodes with a constant noise, i.e., δa[Li,Kj] = 0.5 for the LLC and δg[Li] = 0.5 for the HLC, and then exploits 300 episodes with exponentially decayed noise. We fine-tune the quantized model for ten epochs to recover the accuracy using stochastic gradient descent (SGD) with a fixed learning rate of 10^-3 and momentum of 0.9. ... goal_max is the maximum average QBN for a layer and we set it to 8. ... action_max is the maximum QBN for a kernel and we set it to 8. ... For resource-constrained applications, e.g., low-power drones, AutoQ sets ψacc = 1, ψl = 0, ψe = 0 and ψa = 0 ... For accuracy-guaranteed applications, e.g., fingerprint locks, AutoQ sets ψacc = 2, ψl < 1, ψe < 1 and ψa < 1...
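The quoted exploration schedule (100 constant-noise episodes followed by 300 episodes of exponentially decayed noise) can be sketched as a small Python helper. This is a minimal illustration, not the authors' code: the decay factor and all names below are assumptions, since the paper does not report a decay rate.

```python
# Minimal sketch of the exploration-noise schedule quoted above, assuming
# an exponential per-episode decay. DECAY is a made-up value; the paper
# only says the noise decays exponentially during exploitation.

EXPLORE_EPISODES = 100   # constant-noise exploration phase
EXPLOIT_EPISODES = 300   # decayed-noise exploitation phase
INITIAL_NOISE = 0.5      # the delta values quoted for the LLC and HLC
DECAY = 0.99             # assumed decay factor, not from the paper

def exploration_noise(episode: int) -> float:
    """Noise magnitude applied to the agent's action in a given episode."""
    if episode < EXPLORE_EPISODES:
        return INITIAL_NOISE
    return INITIAL_NOISE * DECAY ** (episode - EXPLORE_EPISODES)

# Reward-weight presets quoted from the paper for two application classes.
RESOURCE_CONSTRAINED = {"psi_acc": 1, "psi_l": 0, "psi_e": 0, "psi_a": 0}
ACCURACY_GUARANTEED = {"psi_acc": 2}  # psi_l, psi_e, psi_a are each < 1
```

With this schedule, the noise stays at 0.5 through episode 100 and then shrinks geometrically, which matches the explore-then-exploit split described in the setup.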