SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
Authors: Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For demonstrating the strength of SQuant, we evaluate the SQuant as well as four SOTA methods, DFQ (Nagel et al., 2019), ZeroQ (Cai et al., 2020), DSG (Zhang et al., 2021; Qin et al., 2021), and GDFQ (Xu et al., 2020), with 5 different CNN models including ResNet-18 & 50 (He et al., 2016), InceptionV3 (Szegedy et al., 2016), SqueezeNext (Gholami et al., 2018) and ShuffleNet (Zhang et al., 2018) on the golden standard dataset ImageNet (Krizhevsky et al., 2012). |
| Researcher Affiliation | Collaboration | Cong Guo (1,2), Yuxian Qiu (1,2), Jingwen Leng (1,2), Xiaotian Gao (3), Chen Zhang (4), Yunxin Liu (5), Fan Yang (3), Yuhao Zhu (6) & Minyi Guo (1,2); 1 Shanghai Jiao Tong University, 2 Shanghai Qi Zhi Institute, 3 Microsoft Research, 4 DAMO Academy, Alibaba Group, 5 Institute for AI Industry Research (AIR), Tsinghua University, 6 University of Rochester |
| Pseudocode | Yes | Algorithm 1: Progressive SQuant Algorithm. Input: Weight tensor W of layer ℓ, scale factor s of layer ℓ. Output: Quantized weight tensor C of layer ℓ. ... Algorithm 2: SQuant Flip Algorithm. Input: Rounded/SQuanted Weight w; Weight perturbation p. Output: Updated Quantized Weight w. |
| Open Source Code | Yes | We have open-sourced the SQuant framework1. 1https://github.com/clevercool/SQuant |
| Open Datasets | Yes | on the golden standard dataset ImageNet (Krizhevsky et al., 2012). |
| Dataset Splits | No | The paper evaluates on the ImageNet dataset but does not explicitly state the training/test/validation dataset splits (e.g., specific percentages or sample counts) used for its experiments, nor does it reference predefined splits with citations for its specific setup. |
| Hardware Specification | Yes | All DFQ algorithms are implemented with PyTorch (Paszke et al., 2019) and evaluated on an NVIDIA A100-40GB GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' but does not specify a version number for PyTorch or any other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | Unless otherwise stated, we employ both weight and activation quantization in all experiments. Also, uniform quantization grids are used in all experiments, and hyper-parameters, e.g., re = rk = 1.0 and rc = 0.5, for all SQuant experiments are the same. |
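The setup rows above mention uniform quantization grids, a per-layer scale factor, and a flip step that updates rounded weights from their perturbations (Algorithm 2's signature). A minimal sketch of these ingredients is shown below; the function names and the flip criterion (flipping the element with the largest rounding perturbation) are illustrative assumptions, not the paper's Hessian-guided SQuant objective.

```python
import numpy as np

def uniform_quantize(w, s, n_bits=8):
    """Round weights onto a uniform n-bit grid with scale factor s (illustrative)."""
    q_max = 2 ** (n_bits - 1) - 1
    q = np.round(w / s)                 # nearest-grid rounding
    return np.clip(q, -q_max - 1, q_max)

def flip_largest_perturbation(q, p):
    """Flip the rounding direction of the element with the largest
    perturbation p = w/s - round(w/s). A stand-in for SQuant's
    Hessian-aware flip decision (assumption)."""
    i = np.argmax(np.abs(p))
    q = q.copy()
    q[i] += np.sign(p[i])               # move one grid step toward the real value
    return q

w = np.array([0.24, -0.61, 0.07, 0.49])  # toy weight tensor
s = 0.1                                  # toy scale factor
q = uniform_quantize(w, s)               # rounded weights on the grid
p = w / s - q                            # rounding perturbations
q_flipped = flip_largest_perturbation(q, p)
```

In the actual SQuant algorithm, flips are chosen to minimize a diagonal-Hessian approximation of the output error rather than the raw perturbation magnitude used in this sketch.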