2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution

Authors: Kai Liu, Haotong Qin, Yong Guo, Xin Yuan, Linghe Kong, Guihai Chen, Yulun Zhang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments on different bits and scaling factors, the performance of DOBI can reach the state-of-the-art (SOTA), while after stage two, our method surpasses existing PTQ methods in both metrics and visual effects.
Researcher Affiliation | Academia | Kai Liu1, Haotong Qin2, Yong Guo3, Xin Yuan4, Linghe Kong1, Guihai Chen1, Yulun Zhang1 — 1Shanghai Jiao Tong University, 2ETH Zürich, 3Max Planck Institute for Informatics, 4Westlake University
Pseudocode | Yes | Algorithm 1: DOBI pipeline
Open Source Code | Yes | The code and models are available at https://github.com/Kai-Liu001/2DQuant.
Open Datasets | Yes | We use DF2K [40, 34] as the training data, which combines DIV2K [40] and Flickr2K [34], as utilized by most SR models.
Dataset Splits | Yes | We use DF2K [40, 34] as the training data, which combines DIV2K [40] and Flickr2K [34], as utilized by most SR models. During training, since we employ a distillation training method, we do not need to use the high-resolution parts of the DF2K images. For validation, we use Set5 [2] as the validation set. After selecting the best model, we test it on five commonly used benchmarks in the SR field: Set5 [2], Set14 [45], B100 [36], Urban100 [18], and Manga109 [37].
Hardware Specification | Yes | Our code is written with Python and PyTorch [38] and runs on an NVIDIA A800-80G GPU.
Software Dependencies | No | Our code is written with Python and PyTorch [38]. (No version numbers are given for Python, PyTorch, or any other libraries used.)
Experiment Setup | Yes | During DQC, we use the Adam [23] optimizer with a learning rate of 1×10^-2, betas set to (0.9, 0.999), and a weight decay of 0. We employ Cosine Annealing [35] as the learning rate scheduler to stabilize the training process. Data augmentation is also performed: we randomly apply rotations of 90°, 180°, and 270° and horizontal flips to the input image. Training runs for a total of 3,000 iterations with a batch size of 32.
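
For readers who want to reproduce the experiment-setup row above, the following is a minimal PyTorch sketch assembled only from the hyperparameters quoted there (Adam with lr 1×10^-2, betas (0.9, 0.999), weight decay 0, cosine annealing, rotation/flip augmentation, 3,000 iterations, batch size 32). The model, the calibrated quantization parameters, the input patches, and the loss are hypothetical placeholders, not the authors' actual DQC implementation from the 2DQuant repository.

import random
import torch

# Placeholders: the quantized SR network and the bound parameters tuned during
# DQC are not reproduced here; a single conv layer stands in for both.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
quant_params = list(model.parameters())

# Optimizer and scheduler as quoted: Adam, lr = 1e-2, betas = (0.9, 0.999),
# weight decay = 0, cosine annealing over the 3,000 training iterations.
optimizer = torch.optim.Adam(quant_params, lr=1e-2, betas=(0.9, 0.999), weight_decay=0)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=3000)

def augment(batch: torch.Tensor) -> torch.Tensor:
    """Random 90/180/270-degree rotation and horizontal flip, as described in the paper."""
    k = random.choice([0, 1, 2, 3])                  # number of 90-degree rotations
    batch = torch.rot90(batch, k, dims=(-2, -1))
    if random.random() < 0.5:
        batch = torch.flip(batch, dims=(-1,))        # horizontal flip
    return batch

for it in range(3000):                               # 3,000 iterations, batch size 32
    lr_batch = augment(torch.rand(32, 3, 48, 48))    # random tensors stand in for DF2K LR patches
    loss = model(lr_batch).pow(2).mean()             # placeholder for the distillation loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()

Stepping the cosine-annealing scheduler once per iteration (rather than per epoch) matches the quoted setup, since training is specified in iterations; the patch size of 48 is an arbitrary choice for this sketch and is not stated in the row above.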