Leveraging Inter-Layer Dependency for Post-Training Quantization
Authors: Changbao Wang, Dandan Zheng, Yuanliu Liu, Liang Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that NWQ outperforms prior state-of-the-art approaches by a large margin: 20.24% for the challenging configuration of MobileNetV2 with 2 bits on ImageNet, pushing extremely low-bit PTQ from feasibility to usability. |
| Researcher Affiliation | Industry | Changbao Wang, Ant Technology Group Co., Ltd., changbao.wcb@antgroup.com; Dandan Zheng, Ant Technology Group Co., Ltd., yuandan.zdd@antgroup.com; Yuanliu Liu, Ant Technology Group Co., Ltd., yuanliu.lyl@antgroup.com; Liang Li, Ant Technology Group Co., Ltd., double.ll@antgroup.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | Our method is easy to implement based on the baseline codes specified in our paper, and the data is ImageNet-1K which is publicly available. |
| Open Datasets | Yes | We randomly sample 1024 images from the ImageNet train set and employ Cutmix [45] and Mixup [46] as data augmentation. The learning rates are 0.01 for the round policy and 0.0004 for the activation quantizer step size. We train for 20000 iterations with a mini-batch size of 32 on 8 Tesla V100 GPUs, taking 30 minutes for ResNet18, which is on par with BRECQ and QDROP. Our experiments are conducted on 5 architectures, including ResNet18 (Res18), MobileNetV2 (MNV2), RegNet-600MF (Reg600MF), RegNet-3.2GF (Reg3.2GF) and MnasNet (Mnas). Other settings remain the same as QDROP [40] if not specified. |
| Dataset Splits | Yes | We report top-1 classification accuracy on the ImageNet [9] validation set. Classical Calibration Set. Following previous works, we compare the performance on a small calibration set of 1024 images. For a start, we achieve a 0.2%–0.9% improvement even compared with strong baselines on W4A4, which are rather close to full-precision accuracy. Scaled-up Calibration Set. To further explore the potential of NWQ, we scale up the calibration set by 10× and reproduce QDROP on the scaled-up calibration set, building a very strong baseline. |
| Hardware Specification | Yes | We train for 20000 iterations with a mini-batch size of 32 on 8 Tesla V100 GPUs |
| Software Dependencies | No | The paper mentions software used ('Our code is based on open-source codes BRECQ and QDrop') but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The learning rates are 0.01 for the round policy and 0.0004 for the activation quantizer step size. We train for 20000 iterations with a mini-batch size of 32 on 8 Tesla V100 GPUs, taking 30 minutes for ResNet18, which is on par with BRECQ and QDROP. Our experiments are conducted on 5 architectures, including ResNet18 (Res18), MobileNetV2 (MNV2), RegNet-600MF (Reg600MF), RegNet-3.2GF (Reg3.2GF) and MnasNet (Mnas). Other settings remain the same as QDROP [40] if not specified. (See the sketch below the table.) |
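
Since the authors did not release code, the following is a minimal, hedged sketch of how the calibration and optimization setup quoted above could be wired in PyTorch on top of a BRECQ/QDrop-style pipeline. The function names (`build_calibration_set`, `run_ptq_reconstruction`), the parameter-name filters, and the stand-in loss are illustrative assumptions, not the authors' actual API; only the numeric hyperparameters (1024 calibration images, batch size 32, 20000 iterations, and the two learning rates) come from the excerpts in the table.

```python
import random

import torch
import torchvision.datasets as datasets
import torchvision.transforms as T

# Hyperparameters reported in the table: 1024 calibration images,
# 20000 iterations, batch size 32, lr 0.01 for the rounding policy
# and 0.0004 for the activation quantizer step size.
NUM_CALIB_IMAGES = 1024
BATCH_SIZE = 32
ITERATIONS = 20_000
LR_ROUND_POLICY = 1e-2
LR_ACT_STEP_SIZE = 4e-4


def build_calibration_set(imagenet_train_dir: str, seed: int = 0):
    """Randomly sample a small calibration set from the ImageNet train split."""
    transform = T.Compose([
        T.RandomResizedCrop(224),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
    ])
    train_set = datasets.ImageFolder(imagenet_train_dir, transform=transform)
    indices = random.Random(seed).sample(range(len(train_set)), NUM_CALIB_IMAGES)
    return torch.utils.data.Subset(train_set, indices)


def run_ptq_reconstruction(model: torch.nn.Module, calib_set) -> None:
    """Skeleton of a reconstruction loop in the spirit of BRECQ/QDrop."""
    loader = torch.utils.data.DataLoader(
        calib_set, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)

    # Hypothetical parameter grouping: in a real pipeline the learnable
    # rounding variables and activation step sizes come from quantizer
    # wrappers, which the paper does not specify.
    round_params = [p for n, p in model.named_parameters() if "round" in n]
    step_params = [p for n, p in model.named_parameters() if "step_size" in n]
    optimizer = torch.optim.Adam([
        {"params": round_params, "lr": LR_ROUND_POLICY},
        {"params": step_params, "lr": LR_ACT_STEP_SIZE},
    ])

    data_iter = iter(loader)
    for _ in range(ITERATIONS):
        try:
            images, _ = next(data_iter)
        except StopIteration:
            data_iter = iter(loader)
            images, _ = next(data_iter)
        # Cutmix/Mixup augmentation and the NWQ network-wise reconstruction
        # objective would be applied here; a plain L2 stand-in keeps the
        # sketch runnable without the unreleased code.
        loss = model(images).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```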