Transformer-Based Selective Super-resolution for Efficient Image Refinement
Authors: Tianyi Zhang, Kishore Kasichainula, Yaoxin Zhuo, Baoxin Li, Jae-Sun Seo, Yu Cao
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on three datasets demonstrate the efficiency and robust performance of our approach for super-resolution. |
| Researcher Affiliation | Academia | ¹University of Minnesota, ²Arizona State University, ³Cornell Tech |
| Pseudocode | Yes | We outline the precise algorithm in Algorithm 1. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | We evaluate the effectiveness of our SSR model on three datasets, BDD100K (Yu et al. 2020), COCO 2017 (Lin et al. 2014) and MSRA10K (Cheng et al. 2015). |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning for all datasets. While COCO 2017 ships with predefined training and validation sets, the paper does not state whether the authors used these or similar splits across all datasets in their experiments. |
| Hardware Specification | Yes | All experiments are conducted for 50 epochs on two Linux servers, each equipped with two NVIDIA A6000 GPUs. |
| Software Dependencies | No | The paper mentions 'YOLOv8' but does not provide a specific version number, nor does it list other software dependencies with version numbers. |
| Experiment Setup | Yes | The embedding dimension of the Tile Selection (TS) module is 96, while it is 180 for the Tile Refinement (TR) module. We set the learning rate to 0.00001. Each TL uses a depth of 2, a window size of 7, and 3 attention heads. We employ a patch size of 2 for embedding, which corresponds to tile sizes of 16×16, 32×32, and 64×64, yielding tile labels of 4×4, 8×8, and 16×16, respectively. The weight parameter α for the loss function is set to 1. The number of RTBs is 6. All experiments are conducted for 50 epochs. |
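Since no code is released, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal, hedged reconstruction: the key names (`ts_embed_dim`, `num_rtb`, etc.) are our own labels, not identifiers from the authors' implementation, and the relation between tile sizes and tile-label grids is our inference from the stated numbers.

```python
# Hedged sketch of the SSR training configuration, assembled from the
# hyperparameters quoted in the paper. Key names are assumptions, since
# the source code is not publicly available.
ssr_config = {
    "ts_embed_dim": 96,          # Tile Selection (TS) module embedding dim
    "tr_embed_dim": 180,         # Tile Refinement (TR) module embedding dim
    "learning_rate": 1e-5,       # 0.00001, as stated
    "tl_depth": 2,               # depth of each Transformer Layer (TL)
    "window_size": 7,
    "num_attention_heads": 3,
    "patch_size": 2,             # patch size for embedding
    "tile_sizes": [16, 32, 64],  # 16x16, 32x32, 64x64 tiles
    "loss_alpha": 1.0,           # weight parameter alpha in the loss
    "num_rtb": 6,                # number of Residual Transformer Blocks
    "epochs": 50,
}

# The quoted tile-label grids (4x4, 8x8, 16x16) are each tile size divided
# by 4 -- consistent with two downsampling steps of the stated patch size 2,
# though the paper does not spell this derivation out.
tile_label_sizes = [s // 4 for s in ssr_config["tile_sizes"]]
print(tile_label_sizes)  # [4, 8, 16]
```

The division-by-4 relationship is only a consistency check on the quoted numbers, not a claim about the model's internal architecture.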