Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling
Authors: Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the consistent improvements of SAIN across various image rescaling datasets in terms of both quantitative and qualitative evaluation under standard image compression formats (i.e., JPEG and WebP). |
| Researcher Affiliation | Industry | Jinhai Yang1*, Mengxi Guo1*, Shijie Zhao1, Junlin Li2, Li Zhang2; 1 Bytedance Inc., Shenzhen, China; 2 Bytedance Inc., San Diego, CA, 92122 USA. {yangjinhai.01, guomengxi.qoelab, zhaoshijie.0526, lijunlin.li, lizhang.idm}@bytedance.com |
| Pseudocode | No | The paper describes the model's components and operations using mathematical equations (e.g., Equations 1-6) and textual descriptions, but it does not include a clearly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | Code is available at https://github.com/yang-jin-hai/SAIN. |
| Open Datasets | Yes | We adopt the 800 HR images from the widely-acknowledged DIV2K training set (Agustsson and Timofte 2017) to train our model. |
| Dataset Splits | Yes | We adopt the 800 HR images from the widely-acknowledged DIV2K training set (Agustsson and Timofte 2017) to train our model. Apart from the DIV2K validation set, we also evaluate our model on 4 standard benchmarks: Set5 (Bevilacqua et al. 2012), Set14 (Zeyde, Elad, and Protter 2010), BSD100 (Martin et al. 2001), and Urban100 (Huang, Singh, and Ahuja 2015). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or cloud computing instance specifications. |
| Software Dependencies | No | The paper mentions the Adam optimizer and L1/L2 pixel losses, but it does not specify software versions for programming languages, libraries, or frameworks (e.g., Python, PyTorch/TensorFlow versions, CUDA versions). |
| Experiment Setup | Yes | For 2× and 4× image rescaling, we use a total of 8 and 16 InvBlocks, and the downscaling module f has 5 and 10 E-InvBlocks, respectively. The input images are cropped to 128×128 and augmented via random horizontal and vertical flips. We adopt the Adam optimizer (Kingma and Ba 2014) with β1 = 0.9 and β2 = 0.999, and set the mini-batch size to 16. The model is trained for 500k iterations. The learning rate is initialized as 2×10^-4 and reduced by half every 100k iterations. We use L1 pixel loss as the LR guidance loss Llr and L2 pixel loss as the HR reconstruction loss. To balance the losses in LR and HR spaces, we use λ1 = 1 and λ2 = λ3 = λ4 = λ5 = 1/4. The compression quality factor (QF) is empirically fixed at 75 during training. The Gaussian mixture for upscaling has K = 5 components. |
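The reported training hyperparameters can be sketched as a small, self-contained configuration. This is a hedged illustration based only on the values quoted above (Adam with β1 = 0.9, β2 = 0.999, base learning rate 2×10^-4 halved every 100k iterations, batch size 16, loss weights λ1 = 1 and λ2..λ5 = 1/4); it is not the authors' code, and the function and dictionary names are hypothetical.

```python
# Hypothetical sketch of the SAIN training schedule described in the paper.
# Only the quoted hyperparameter values are taken from the source.

ADAM_CONFIG = {"betas": (0.9, 0.999), "lr": 2e-4}  # Adam (Kingma and Ba 2014)
BATCH_SIZE = 16
TOTAL_ITERS = 500_000
# Loss weights: λ1 = 1 (L1 LR guidance), λ2 = λ3 = λ4 = λ5 = 1/4 (HR side).
LAMBDAS = [1.0, 0.25, 0.25, 0.25, 0.25]


def step_lr(iteration, base_lr=2e-4, decay=0.5, step_size=100_000):
    """Learning rate reduced by half every 100k iterations."""
    return base_lr * decay ** (iteration // step_size)


def total_loss(loss_terms, lambdas=LAMBDAS):
    """Weighted sum of the five loss terms with the paper's λ values."""
    assert len(loss_terms) == len(lambdas)
    return sum(w * l for w, l in zip(lambdas, loss_terms))
```

For example, `step_lr(0)` gives the initial rate 2e-4, and after 450k iterations the rate has been halved four times, giving 1.25e-5.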