Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

Authors: Xingchao Liu, Mao Ye, Dengyong Zhou, Qiang Liu (pp. 8697-8705)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our method can be implemented by common operands, bringing almost no memory and computation overhead. We show that our method outperforms a range of state-of-the-art methods on ImageNet classification and it can be generalized to more challenging tasks like PASCAL VOC object detection.
Researcher Affiliation | Collaboration | Xingchao Liu1, Mao Ye1, Dengyong Zhou2, Qiang Liu1 (1 The University of Texas at Austin; 2 Google Brain)
Pseudocode | Yes | Algorithm 1: Optimization of Problem 5
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository.
Open Datasets | Yes | We evaluate our method on two tasks, ImageNet classification (Krizhevsky, Sutskever, and Hinton 2012) and PASCAL VOC object detection (Everingham et al. 2007).
Dataset Splits | Yes | We take 256 images from the training set as the calibration set. The calibration set is used to quantize activations and to choose the channels on which to perform multi-point quantization.
Hardware Specification | No | The paper discusses hardware only in general terms such as 'specialized hardware accelerators' and 'commodity hardware'; it does not specify the exact GPU models, CPU models, or other machine details used to run its experiments.
Software Dependencies | No | The paper mentions 'PyTorch' as the source of the pretrained models but does not give version numbers for PyTorch or any other software dependency, so the software environment cannot be reproduced exactly.
Experiment Setup | Yes | For all experiments, we set the maximal step size for the grid search in Eq. 9 to η = 1/2^10. We take 256 images from the training set as the calibration set. Like previous works, the weights of the first and the last layer are always quantized to 8-bit (Nahshan et al. 2019; Li, Dong, and Wang 2019; Banner, Nahshan, and Soudry 2019).
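The paper's central claim is that a weighted combination of several low-precision "points" can emulate higher precision using only common operands (additions and multiplications). The sketch below illustrates that idea with greedy residual quantization; it is only an illustration under assumed uniform symmetric quantization, not the paper's Algorithm 1, which chooses the points by solving an optimization problem (Problem 5).

```python
import numpy as np

def uniform_quantize(w, n_bits=2):
    """Symmetric uniform quantization: one low-precision 'point'."""
    levels = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / levels
    if scale == 0:
        return np.zeros_like(w, dtype=float)
    return np.round(w / scale).clip(-levels, levels) * scale

def multipoint_quantize(w, n_points=3, n_bits=2):
    """Greedily approximate w by a sum of low-bit vectors.

    Each step quantizes the current residual, so the summed approximation
    uses only adds/multiplies of low-precision values yet shrinks the
    error relative to a single low-bit quantizer.
    """
    residual = w.astype(float)
    approx = np.zeros_like(residual)
    for _ in range(n_points):
        q = uniform_quantize(residual, n_bits)
        approx = approx + q
        residual = residual - q
    return approx

rng = np.random.default_rng(0)
w = rng.standard_normal(64)
err_single = np.linalg.norm(w - uniform_quantize(w, 2))
err_multi = np.linalg.norm(w - multipoint_quantize(w, 3, 2))
```

Because 0 is always a grid point and rounding is to the nearest level, each residual-quantization step can never increase the residual norm, so `err_multi` is at most `err_single`.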
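Two details of the experiment setup can be sketched in code: keeping the first and last layers at 8-bit while other layers use fewer bits, and a 1-D grid search over the clipping scale with a bounded step size. The objective below (quantization MSE), the search range, and the η value are assumptions for illustration; the paper's actual objective is defined by its Eq. 9.

```python
import numpy as np

def assign_bits(layer_names, default_bits=4, endpoint_bits=8):
    """First and last layers stay at 8-bit, as in the quoted setup."""
    bits = {name: default_bits for name in layer_names}
    bits[layer_names[0]] = endpoint_bits
    bits[layer_names[-1]] = endpoint_bits
    return bits

def grid_search_scale(w, n_bits=4, eta=1.0 / 2 ** 10):
    """Grid-search the clipping fraction with step at most eta (a sketch).

    Minimizes quantization MSE over scales frac * max|w| / levels;
    the MSE objective is an assumption, not the paper's Eq. 9.
    """
    levels = 2 ** (n_bits - 1) - 1
    w_max = np.abs(w).max()
    best_scale, best_err = None, np.inf
    for frac in np.arange(eta, 1.0 + eta / 2, eta):
        scale = frac * w_max / levels
        q = np.round(w / scale).clip(-levels, levels) * scale
        err = float(np.mean((w - q) ** 2))
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale, best_err

# Hypothetical four-layer network: endpoints at 8-bit, middle layers at 4-bit.
bits = assign_bits(["conv1", "conv2", "conv3", "fc"])
rng = np.random.default_rng(0)
w = rng.standard_normal(256)
scale, err = grid_search_scale(w, n_bits=bits["conv2"])
```

Since the grid includes the full-range scale (frac = 1.0), the searched error is never worse than quantizing with the naive max-based scale.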