PTMQ: Post-training Multi-Bit Quantization of Neural Networks

Authors: Ke Xu, Zhongcheng Li, Shanshan Wang, Xingyi Zhang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that PTMQ achieves comparable performance to existing state-of-the-art post-training quantization methods, while its optimization speeds up by 100× compared to recent multi-bit quantization works. Code is available at https://github.com/xuke225/PTMQ.
Researcher Affiliation | Academia | Ke Xu (1,2), Zhongcheng Li (2), Shanshan Wang (1*), Xingyi Zhang (1,3*). (1) Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University; (2) School of Artificial Intelligence, Anhui University, Hefei, China; (3) School of Computer Science and Technology, Anhui University, Hefei, China. Emails: {xuke,wang.shanshan}@ahu.edu.cn, lizhongcheng@stu.ahu.edu.cn, xyzhanghust@gmail.com
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/xuke225/PTMQ.
Open Datasets | Yes | We assess the performance of the proposed PTMQ scheme on various CNN-based architectures (ResNet (He et al. 2016), MobileNetV2 (Sandler et al. 2018), RegNet (Radosavovic et al. 2020)) and transformer-based architectures (ViT (Dosovitskiy et al. 2021), DeiT (Touvron et al. 2021)) on the ImageNet (Russakovsky et al. 2014) dataset.
Dataset Splits | No | The paper mentions using 'calibration data' but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) or reference predefined standard splits for reproducibility; see the calibration sketch after this table.
Hardware Specification | Yes | The time measurement is carried out with NVIDIA 3090.
Software Dependencies | No | The paper mentions using other methods like QDrop and PTQ4ViT, but it does not provide specific version numbers for its own software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | No | The paper describes the overall optimization process and components like MFM and GD-Loss, but it does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations in the main text.
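
Because the paper only refers to 'calibration data' without stating a split or sample count, the snippet below is a minimal sketch of how a calibration set is commonly drawn for ImageNet post-training quantization. It is not taken from the PTMQ repository: the dataset path, random seed, sample count (1024 images), batch size, and preprocessing are all illustrative assumptions.

```python
import torch
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, Subset

# Standard ImageNet-style preprocessing (assumed; the paper does not specify it).
transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical choice: draw 1024 random images from the ImageNet training split
# to serve as the calibration set; the actual count used by PTMQ is not stated.
train_set = ImageFolder("/path/to/imagenet/train", transform=transform)
generator = torch.Generator().manual_seed(0)  # fixed seed for reproducibility
indices = torch.randperm(len(train_set), generator=generator)[:1024]
calib_loader = DataLoader(Subset(train_set, indices), batch_size=32, shuffle=False)

# The calibration loader would then be passed to the post-training quantization
# routine (e.g., for collecting activation statistics or block reconstruction).
for images, _ in calib_loader:
    pass  # placeholder: feed batches to the quantization pipeline here
```

Fixing the seed and recording the sampled indices is what would make such a calibration split reproducible, which is the information the Dataset Splits row flags as missing.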