MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution

Authors: Kailu Wu, Chung-Kuei Lee, Kaisheng Ma

ICML 2022 | Conference PDF | Archive PDF

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments suggest that the proposed method results in a model with a competitive trade-off between accuracy and speed at a much lower memory footprint than other state-of-the-art lightweight approaches.
Researcher Affiliation | Collaboration | Kailu Wu (1), Chung-Kuei Lee (2), Kaisheng Ma (1); (1) Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; (2) HiSilicon Technologies, Shanghai, China.
Pseudocode | Yes | Algorithm 1: Determination of M and Pseudo Student Features
Open Source Code | No | The paper does not provide a direct link or explicit statement about the public availability of its source code.
Open Datasets | Yes | We use the DIV2K (Agustsson & Timofte, 2017) dataset to train our network, which includes 800 pairs of LR and HR images and is widely used in image restoration tasks.
Dataset Splits | Yes | DIV2K is used as the training dataset. We randomly crop HR patches of size 192×192 from the HR images; LR patches are cropped from the corresponding LR images according to the scale factor. Standard data augmentation (i.e., random rotation and horizontal flipping) is used, as in existing works (Hui et al., 2019; Lee et al., 2020; Lim et al., 2017; Hui et al., 2018; Luo et al., 2020; Muqeet et al., 2020). The teacher network is directly trained with random initialization. (A patch-sampling sketch follows the table.)
Hardware Specification | Yes | The maximum memory footprint and the average inference time for ×2 upscaling of an LR image of size 960×540, with full precision, on PyTorch (Paszke et al., 2019) and an Nvidia GTX 1080 Ti are shown above. (A profiling sketch follows the table.)
Software Dependencies | Yes | The maximum memory footprint and the average inference time for ×2 upscaling of an LR image of size 960×540, with full precision, on PyTorch (Paszke et al., 2019) and an Nvidia GTX 1080 Ti are shown above. For the learning rate, we use a one-cycle learning rate scheduler (Smith & Topin, 2019) from PyTorch (Paszke et al., 2019) with a maximum learning rate of 2×10^-4.
Experiment Setup | Yes | DIV2K is used as the training dataset. We randomly crop HR patches of size 192×192 from the HR images; LR patches are cropped from the corresponding LR images according to the scale factor. Standard data augmentation (i.e., random rotation and horizontal flipping) is used, as in existing works (Hui et al., 2019; Lee et al., 2020; Lim et al., 2017; Hui et al., 2018; Luo et al., 2020; Muqeet et al., 2020). The teacher network is directly trained with random initialization. For training the plain student network with our framework, we use β = 3 in Algorithm 1, which is sufficient for the algorithm (see the appendix for the proof). We use C = 1000 in Figure 3 and λ = 0.3 in Equation 14. All super-resolution models in this paper are trained with a batch size of 16 and 10^6 iterations in total. We use Adam (Kingma & Ba, 2015) with β1 = 0.9 and β2 = 0.999 as the optimizer. For the learning rate, we use a one-cycle learning rate scheduler (Smith & Topin, 2019) from PyTorch (Paszke et al., 2019) with a maximum learning rate of 2×10^-4. All experiments using our framework are repeated 4 times with global seeds 233, 234, 235, and 236. (A training-loop sketch follows the table.)
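
The patch sampling and augmentation quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a short sketch. This is a minimal reconstruction, not the authors' code: the function name sample_training_pair, the numpy HWC array convention, and the crop-alignment details are assumptions; only the 192×192 HR patch size, the scale-factor-matched LR crop, and the random rotation/horizontal flip come from the quoted text.

```python
# Minimal sketch (not the paper's code) of DIV2K patch sampling and augmentation:
# 192x192 HR crops, aligned LR crops scaled by the SR factor, random 90-degree
# rotation and horizontal flipping.
import random
import numpy as np

def sample_training_pair(hr_img, lr_img, scale=2, hr_patch=192):
    """hr_img, lr_img: HWC uint8 arrays; lr_img is the downscaled version of hr_img."""
    lr_patch = hr_patch // scale
    # Random crop from the LR image, then take the aligned HR crop.
    h, w = lr_img.shape[:2]
    y = random.randint(0, h - lr_patch)
    x = random.randint(0, w - lr_patch)
    lr = lr_img[y:y + lr_patch, x:x + lr_patch]
    hr = hr_img[y * scale:(y + lr_patch) * scale, x * scale:(x + lr_patch) * scale]
    # Standard augmentation: random horizontal flip and random 90-degree rotation.
    if random.random() < 0.5:
        lr, hr = lr[:, ::-1], hr[:, ::-1]
    k = random.randint(0, 3)
    lr, hr = np.rot90(lr, k), np.rot90(hr, k)
    return lr.copy(), hr.copy()
```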
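The Hardware Specification row quotes peak-memory and latency numbers for ×2 upscaling of a 960×540 LR input in full precision. Below is a hypothetical sketch of how such numbers could be obtained with standard PyTorch utilities (torch.cuda.max_memory_allocated and wall-clock timing); the paper does not publish its measurement script, so the function name profile, the warm-up count, and the number of timed runs are assumptions.

```python
# Hypothetical profiling sketch: peak GPU memory and average latency of one
# forward pass on a 960x540 full-precision input. `model` is any SR network.
import time
import torch

def profile(model, height=540, width=960, runs=50, device="cuda"):
    model = model.to(device).eval()
    x = torch.randn(1, 3, height, width, device=device)
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        for _ in range(5):                 # warm-up iterations
            model(x)
        torch.cuda.synchronize(device)
        start = time.time()
        for _ in range(runs):
            model(x)
        torch.cuda.synchronize(device)
    avg_time_ms = (time.time() - start) / runs * 1000
    peak_mem_mb = torch.cuda.max_memory_allocated(device) / 2**20
    return avg_time_ms, peak_mem_mb
```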
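Finally, the optimisation settings listed under Experiment Setup and Software Dependencies (Adam with β1 = 0.9 and β2 = 0.999, PyTorch's one-cycle scheduler with maximum learning rate 2×10^-4, batch size 16, 10^6 iterations, global seeds 233-236) can be assembled into a skeletal training loop. This is a sketch under stated assumptions, not the paper's training code: the stand-in student network, the dummy data loader, and the plain L1 loss are placeholders, and the actual MemSR objective additionally includes the feature-distillation term weighted by λ = 0.3 (Equation 14).

```python
# Skeletal training loop assembling the quoted hyperparameters; network, data,
# and loss are placeholders, not the MemSR implementation.
import torch

TOTAL_ITERS = 1_000_000
SEEDS = (233, 234, 235, 236)
torch.manual_seed(SEEDS[0])  # one global seed per repeated run

# Stand-in for the plain student network (x2 upscaling via pixel shuffle).
student = torch.nn.Sequential(
    torch.nn.Conv2d(3, 64, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(64, 3 * 4, 3, padding=1), torch.nn.PixelShuffle(2),
).cuda()

def dummy_loader(batch=16, scale=2, patch=192):
    # Stand-in for the DIV2K loader: yields random LR/HR patch pairs forever.
    while True:
        hr = torch.rand(batch, 3, patch, patch)
        lr = torch.nn.functional.interpolate(hr, scale_factor=1 / scale, mode="bicubic")
        yield lr, hr

optimizer = torch.optim.Adam(student.parameters(), betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=2e-4, total_steps=TOTAL_ITERS)

for step, (lr_batch, hr_batch) in enumerate(dummy_loader()):
    if step >= TOTAL_ITERS:
        break
    # Placeholder reconstruction loss; the paper's objective also contains a
    # feature-distillation term weighted by lambda = 0.3 (Equation 14).
    loss = torch.nn.functional.l1_loss(student(lr_batch.cuda()), hr_batch.cuda())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```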