MemSR: Training Memory-efficient Lightweight Model for Image Super-Resolution
Authors: Kailu Wu, Chung-Kuei Lee, Kaisheng Ma
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments suggest that the proposed method results in a model with a competitive trade-off between accuracy and speed at a much lower memory footprint than other state-of-the-art lightweight approaches. |
| Researcher Affiliation | Collaboration | Kailu Wu (1), Chung-Kuei Lee (2), Kaisheng Ma (1); (1) Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China; (2) HiSilicon Technologies, Shanghai, China. |
| Pseudocode | Yes | Algorithm 1: Determination of M and Pseudo Student Features |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the public availability of its source code. |
| Open Datasets | Yes | We use the DIV2K (Agustsson & Timofte, 2017) dataset to train our network, which includes 800 pairs of LR and HR images and is widely used in image restoration tasks. |
| Dataset Splits | Yes | DIV2K is used as the training dataset. We randomly crop HR patches of size 192×192 from the HR images. LR patches are cropped from the corresponding LR images by the scale factor. Standard data augmentation (i.e. random rotation and horizontal flipping) is used as in existing works (Hui et al., 2019; Lee et al., 2020; Lim et al., 2017; Hui et al., 2018; Luo et al., 2020; Muqeet et al., 2020). The teacher network is directly trained with random initialization. (See the data-pipeline sketch after the table.) |
| Hardware Specification | Yes | The maximum memory footprint and the average inference time for ×2 upscaling of an LR image of size 960×540 with full precision on PyTorch (Paszke et al., 2019) and an Nvidia GTX 1080Ti are shown above. |
| Software Dependencies | Yes | The maximum memory footprint and the average inference time for ×2 upscaling of an LR image of size 960×540 with full precision on PyTorch (Paszke et al., 2019) and an Nvidia GTX 1080Ti are shown above. For the learning rate, we use a one-cycle learning rate scheduler (Smith & Topin, 2019) from PyTorch (Paszke et al., 2019) with maximum learning rate 2×10⁻⁴. |
| Experiment Setup | Yes | DIV2K is used as the training dataset. We randomly crop HR patches of size 192×192 from the HR images. LR patches are cropped from the corresponding LR images by the scale factor. Standard data augmentation (i.e. random rotation and horizontal flipping) is used as in existing works (Hui et al., 2019; Lee et al., 2020; Lim et al., 2017; Hui et al., 2018; Luo et al., 2020; Muqeet et al., 2020). The teacher network is directly trained with random initialization. For training the plain student network with our framework, we use β = 3 in Algorithm 1, which is sufficient for the algorithm (see the appendix for the proof). We use C = 1000 in Figure 3 and λ = 0.3 in Equation 14. All super-resolution models in this paper are trained with a batch size of 16 and 10⁶ iterations in total. We use Adam (Kingma & Ba, 2015) with β1 = 0.9 and β2 = 0.999 as the optimizer. For the learning rate, we use a one-cycle learning rate scheduler (Smith & Topin, 2019) from PyTorch (Paszke et al., 2019) with maximum learning rate 2×10⁻⁴. All experiments using our framework are repeated 4 times with global seeds 233, 234, 235, and 236. (See the training-configuration sketch after the table.) |
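
As a reading aid for the Dataset Splits row, here is a minimal PyTorch-style sketch of the paired patch cropping and augmentation described there. The function names and the CHW tensor layout are assumptions for illustration, not taken from the paper; only the 192×192 HR patch size, the scale-factor-aligned LR crop, and the flip/rotation augmentation come from the quoted text.

```python
import random
import torch

def random_paired_crop(hr, lr, hr_patch=192, scale=2):
    """Crop a random 192x192 HR patch and the aligned LR patch (CHW tensors)."""
    lr_patch = hr_patch // scale
    _, lh, lw = lr.shape
    lx = random.randint(0, lw - lr_patch)   # top-left corner in LR coordinates
    ly = random.randint(0, lh - lr_patch)
    hx, hy = lx * scale, ly * scale         # corresponding corner in HR coordinates
    return (hr[:, hy:hy + hr_patch, hx:hx + hr_patch],
            lr[:, ly:ly + lr_patch, lx:lx + lr_patch])

def augment(hr, lr):
    """Standard SR augmentation: random horizontal flip and random 90-degree rotation."""
    if random.random() < 0.5:
        hr, lr = torch.flip(hr, [2]), torch.flip(lr, [2])        # flip along width
    k = random.randint(0, 3)
    hr, lr = torch.rot90(hr, k, [1, 2]), torch.rot90(lr, k, [1, 2])
    return hr, lr
```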
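
Similarly, the optimizer and learning-rate schedule quoted in the Experiment Setup row can be set up roughly as below. The placeholder model and the commented-out loss are hypothetical stand-ins; only the quoted hyperparameters (10⁶ iterations, Adam with β1 = 0.9 and β2 = 0.999, one-cycle schedule with maximum learning rate 2×10⁻⁴, seeds 233–236) come from the report.

```python
import torch

torch.manual_seed(233)                       # runs are repeated with seeds 233, 234, 235, 236
total_iters = 1_000_000                      # 10^6 training iterations, batch size 16
model = torch.nn.Conv2d(3, 3, 3, padding=1)  # placeholder for the plain student network

optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=2e-4,
                                                total_steps=total_iters)

for step in range(total_iters):
    # forward pass and loss go here: reconstruction loss plus the distillation
    # term weighted by lambda = 0.3, then loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```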