SeqRank: Sequential Ranking of Salient Objects
Authors: Huankang Guan, Rynson W.H. Lau
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to confirm the effectiveness of our approach, and our model achieves new state-of-the-art results on the existing SOR benchmarks. We conduct experiments on the public SOR benchmarks, ASSR (Siris et al. 2020) and IRSR (Liu et al. 2021a). |
| Researcher Affiliation | Academia | Department of Computer Science, City University of Hong Kong |
| Pseudocode | No | The paper describes its methods textually and with diagrams (e.g., Figure 2, Figure 3, Figure 4) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/guanhuankang/SeqRank. |
| Open Datasets | Yes | We conduct experiments on the public SOR benchmarks, ASSR (Siris et al. 2020) and IRSR (Liu et al. 2021a). |
| Dataset Splits | Yes | ASSR is constructed from MS-COCO (Lin et al. 2014) and SALICON (Jiang et al. 2015) and comprises 7646 images for training, 1436 images for validation and 2418 images for testing. IRSR consists of 6059 training images and 2929 testing images... |
| Hardware Specification | Yes | We develop SeqRank using Detectron2 (Wu et al. 2019) and train it with 4 NVIDIA A100-SXM4-80GB GPUs. |
| Software Dependencies | No | The paper mentions 'Detectron2 (Wu et al. 2019)' and backbone networks like 'ResNet' and 'Swin Transformer', along with the 'AdamW optimizer'. However, it does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We set N = 100, d = 256, L = 2 and P = H/32, where H, W are the resolution of the input image, which is set to 800x800. During training, as suggested by (Cheng et al. 2022), we calculate ℓ_mask on K randomly sampled points instead of the whole mask to improve training efficiency and reduce training memory. K is set to 12544, i.e., 112x112 points. We train SeqRank for 30k iterations with a batch size of 32. We adopt the AdamW optimizer with a weight decay of 1e-4. The initial learning rate is set to 1e-4 and is decayed to 1e-5 after 20k iterations. We use random flip for data augmentation. |
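
The point-sampling trick in the setup above follows Mask2Former (Cheng et al. 2022): the mask loss ℓ_mask is evaluated on K = 12544 (112x112) randomly sampled point locations rather than on every pixel of the mask. Below is a minimal PyTorch sketch of that idea; the function name is illustrative, and uniform random sampling with a binary cross-entropy point loss are simplifying assumptions (Mask2Former itself mixes importance-sampled and uniformly sampled points, and SeqRank's released code may differ).

```python
import torch
import torch.nn.functional as F

def point_sampled_mask_loss(pred_logits, gt_masks, K=112 * 112):
    """Evaluate the mask loss on K sampled points instead of all pixels.

    pred_logits, gt_masks: (B, 1, H, W); gt_masks holds {0, 1} values.
    """
    B = pred_logits.shape[0]
    # K random points per mask, as normalized (x, y) coordinates in [-1, 1]
    # so they can be fed straight to grid_sample.
    coords = torch.rand(B, K, 1, 2, device=pred_logits.device) * 2 - 1
    # Bilinearly read off predictions and targets at the sampled points.
    pred_pts = F.grid_sample(pred_logits, coords, align_corners=False)
    gt_pts = F.grid_sample(gt_masks, coords, align_corners=False)
    # (B, 1, K, 1) -> (B, K), then a per-point binary cross-entropy.
    pred_pts = pred_pts.view(B, K)
    gt_pts = gt_pts.view(B, K)
    return F.binary_cross_entropy_with_logits(pred_pts, gt_pts)

# Toy usage: low-resolution mask logits supervised against an 800x800 GT mask.
pred = torch.randn(2, 1, 200, 200)
gt = (torch.rand(2, 1, 800, 800) > 0.5).float()
print(point_sampled_mask_loss(pred, gt))
```

Because only K points are interpolated per mask, memory scales with K rather than with the full 800x800 resolution, which is the efficiency gain the paper cites.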
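
For the optimization recipe (AdamW with weight decay 1e-4, initial learning rate 1e-4 decayed to 1e-5 after 20k of the 30k iterations, batch size 32), a plain PyTorch equivalent might look like the following sketch; the placeholder linear model and random tensors stand in for SeqRank's actual Detectron2 training pipeline and losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 1)  # placeholder for the SeqRank network

# AdamW with the reported initial learning rate and weight decay.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

# Step decay: lr drops from 1e-4 to 1e-5 after 20k of the 30k iterations.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20_000], gamma=0.1
)

for iteration in range(30_000):
    x = torch.randn(32, 16)  # batch size 32 (stand-in for 800x800 flipped images)
    y = torch.randn(32, 1)
    loss = F.mse_loss(model(x), y)  # stand-in for SeqRank's actual losses
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

MultiStepLR with gamma=0.1 at iteration 20k reproduces the reported 1e-4 to 1e-5 step; the released code may instead implement this through Detectron2's own scheduler utilities.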