Sharing Key Semantics in Transformer Makes Efficient Image Restoration
Authors: Bin Ren, Yawei Li, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Rita Cucchiara, Luc Van Gool, Ming-Hsuan Yang, Nicu Sebe
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across 6 IR tasks confirm the proposed SemanIR's state-of-the-art performance, quantitatively and qualitatively showcasing advancements. |
| Researcher Affiliation | Collaboration | Bin Ren (1,2,3), Yawei Li (4), Jingyun Liang (4), Rakesh Ranjan (5), Mengyuan Liu (6), Rita Cucchiara (7), Luc Van Gool (3), Ming-Hsuan Yang (8), Nicu Sebe (2). Affiliations: 1 University of Pisa; 2 University of Trento; 3 INSAIT, Sofia University; 4 ETH Zürich; 5 Meta Reality Labs; 6 State Key Laboratory of General Artificial Intelligence, Peking University, Shenzhen Graduate School; 7 University of Modena and Reggio Emilia; 8 University of California, Merced |
| Pseudocode | Yes | Algorithm 1: Key-Semantic Transformer Stage (i.e., SemanIR Stage) |
| Open Source Code | Yes | The visual results, code, and trained models are available at https://github.com/Amazingren/SemanIR. |
| Open Datasets | Yes | The training datasets: DIV2K [1], Flickr2K [51], and WED [57]. The test datasets: Classic5 [22], LIVE1 [75], Urban100 [30], BSD500 [2]. |
| Dataset Splits | No | The paper lists training and testing datasets but does not explicitly provide details about validation dataset splits (e.g., percentages, counts, or specific methods for creating validation sets) for its experiments. |
| Hardware Specification | Yes | All experiments are conducted on 4 NVIDIA Tesla V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions optimizers (Adam, AdamW) and loss functions (smooth L1, VGG, Charbonnier, L1) and hints at PyTorch (torch.gather(), torch-mask) and Triton, but does not specify version numbers for these software dependencies (e.g., PyTorch 1.x, CUDA 11.x). A hedged torch.gather()-based sketch of the attention idea appears after this table. |
| Experiment Setup | Yes | Batch size and patch size: similar to the comparison methods, i.e., (batch size = 16, patch size = 64) for JPEG CAR, denoising, demosaicking, and SR; (batch size = 32, patch size = 16) for IR in AWC; (batch size = 8, patch size = 192) for deblurring. Learning rate schedule: for all IR tasks, as in the comparison methods, the initial learning rate is set to 2×10⁻⁴ and then halved during training (half-decay). Training runs for 1M iterations for JPEG CAR, denoising, demosaicking, and SR, and for 750K iterations for IR in AWC and deblurring. A sketch of this schedule follows the table. |
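The paper's Algorithm 1 (the SemanIR stage) and its torch.gather() hint suggest attention restricted to semantically similar keys. Below is a minimal sketch of that idea, not the authors' exact implementation: each query attends only to its top-k most similar keys, gathered with torch.gather(). The function name, shapes, and k value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def topk_key_semantic_attention(q, k, v, topk=8):
    """q, k, v: (batch, tokens, dim) -> (batch, tokens, dim). Illustrative only."""
    b, n, d = q.shape
    # Query-key similarity; scaling is omitted here since it does not
    # change the top-k ordering.
    sim = q @ k.transpose(-2, -1)                          # (b, n, n)
    idx = sim.topk(topk, dim=-1).indices                   # (b, n, topk)
    # Gather only the top-k keys/values per query via torch.gather().
    idx = idx.unsqueeze(-1).expand(-1, -1, -1, d)          # (b, n, topk, d)
    k_sel = torch.gather(k.unsqueeze(1).expand(-1, n, -1, -1), 2, idx)
    v_sel = torch.gather(v.unsqueeze(1).expand(-1, n, -1, -1), 2, idx)
    # Scaled dot-product attention over the gathered keys only.
    attn = F.softmax((q.unsqueeze(2) * k_sel).sum(-1) / d ** 0.5, dim=-1)
    return (attn.unsqueeze(-1) * v_sel).sum(dim=2)         # (b, n, d)

x = torch.randn(2, 64, 32)                                 # toy token embeddings
print(topk_key_semantic_attention(x, x, x).shape)          # torch.Size([2, 64, 32])
```

Restricting each query to its k most similar keys is what makes sharing key semantics efficient: attention cost drops from O(n²) pairs to O(n·k).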
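The stated learning-rate recipe (initial rate 2×10⁻⁴, half-decay, 1M-iteration budget) can be sketched with standard PyTorch utilities. The milestone positions below are assumptions; the paper states only the initial rate, the halving scheme, and the iteration budget, and `model` is a placeholder module.

```python
import torch

# Placeholder standing in for SemanIR; milestones are assumed, not from the paper.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[500_000, 800_000, 900_000, 950_000], gamma=0.5)
# Inside the training loop, call optimizer.step() then scheduler.step()
# once per iteration so the halvings land at the milestones above.
```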