2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
Authors: Kai Liu, Haotong Qin, Yong Guo, Xin Yuan, Linghe Kong, Guihai Chen, Yulun Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on different bit widths and scaling factors, DOBI reaches state-of-the-art (SOTA) performance, while after stage two our method surpasses existing PTQ approaches in both metrics and visual quality. |
| Researcher Affiliation | Academia | Kai Liu¹, Haotong Qin², Yong Guo³, Xin Yuan⁴, Linghe Kong¹, Guihai Chen¹, Yulun Zhang¹; ¹Shanghai Jiao Tong University, ²ETH Zürich, ³Max Planck Institute for Informatics, ⁴Westlake University |
| Pseudocode | Yes | Algorithm 1: DOBI pipeline |
| Open Source Code | Yes | The code and models are available at https://github.com/Kai-Liu001/2DQuant. |
| Open Datasets | Yes | We use DF2K [40, 34] as the training data, which combines DIV2K [40] and Flickr2K [34], as utilized by most SR models. |
| Dataset Splits | Yes | We use DF2K [40, 34] as the training data, which combines DIV2K [40] and Flickr2K [34], as utilized by most SR models. During training, since we employ a distillation training method, we do not need to use the high-resolution parts of the DF2K images. For validation, we use the Set5 [2] as the validation set. After selecting the best model, we tested it on five commonly used benchmarks in the SR field: Set5 [2], Set14 [45], B100 [36], Urban100 [18], and Manga109 [37]. |
| Hardware Specification | Yes | Our code is written with Python and PyTorch [38] and runs on an NVIDIA A800-80G GPU. |
| Software Dependencies | No | Our code is written with Python and PyTorch [38]. (No version numbers are given for Python, PyTorch, or any other libraries used.) |
| Experiment Setup | Yes | During DQC, we use the Adam [23] optimizer with a learning rate of 1×10⁻², betas set to (0.9, 0.999), and a weight decay of 0. We employ Cosine Annealing [35] as the learning rate scheduler to stabilize the training process. Data augmentation is also performed: we randomly apply rotations of 90°, 180°, and 270° and horizontal flips to the input image. The total number of training iterations is 3,000 with a batch size of 32. (A hedged configuration sketch of this setup follows the table.) |
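
As a reading aid for the "Experiment Setup" row, below is a minimal PyTorch sketch of the quoted DQC fine-tuning recipe: Adam with learning rate 1×10⁻², betas (0.9, 0.999), zero weight decay, Cosine Annealing scheduling, rotation/flip augmentation, 3,000 iterations, and batch size 32. Only these hyperparameters come from the paper; the model, the DF2K dataloader, and the distillation loss are placeholder stand-ins, not the authors' implementation (which is available at the repository linked above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins so the sketch runs end to end; the real low-bit SR model and
# full-precision teacher are defined in the authors' repository.
quantized_model = nn.Conv2d(3, 3, 3, padding=1)   # placeholder for the quantized SR model
fp_teacher = nn.Conv2d(3, 3, 3, padding=1)        # placeholder for the FP32 teacher
fp_teacher.eval()

# Adam with lr 1e-2, betas (0.9, 0.999), weight decay 0, as quoted above.
optimizer = torch.optim.Adam(
    quantized_model.parameters(),
    lr=1e-2,
    betas=(0.9, 0.999),
    weight_decay=0.0,
)
# Cosine Annealing over the full 3,000-iteration schedule.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=3000)

def augment(x):
    """Random rotation by 90/180/270 degrees plus random horizontal flip."""
    k = int(torch.randint(0, 4, (1,)))             # k = 0 leaves the patch unchanged
    x = torch.rot90(x, k, dims=(-2, -1))
    if torch.rand(1) < 0.5:
        x = torch.flip(x, dims=(-1,))
    return x

for step in range(3000):                           # 3,000 iterations, batch size 32
    lr_patch = augment(torch.rand(32, 3, 48, 48))  # stand-in for a DF2K low-resolution batch
    with torch.no_grad():
        target = fp_teacher(lr_patch)              # teacher output replaces HR ground truth
    loss = F.l1_loss(quantized_model(lr_patch), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The L1 distillation objective here is an assumption made only to complete the loop; the paper's excerpt states that distillation removes the need for high-resolution targets but does not specify the loss, and the DQC bound refinement itself is not reproduced in this sketch.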