SeqRank: Sequential Ranking of Salient Objects

Authors: Huankang Guan, Rynson W.H. Lau

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to show the superior performance of our model. Extensive experiments are conducted to confirm the effectiveness of our approach, and our model achieves new state-of-the-art results on the existing SOR benchmarks. We conduct experiments on the public SOR benchmarks, ASSR (Siris et al. 2020) and IRSR (Liu et al. 2021a).
Researcher Affiliation | Academia | Department of Computer Science, City University of Hong Kong
Pseudocode | No | The paper describes its methods textually and with diagrams (e.g., Figure 2, Figure 3, Figure 4) but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/guanhuankang/SeqRank.
Open Datasets | Yes | We conduct experiments on the public SOR benchmarks, ASSR (Siris et al. 2020) and IRSR (Liu et al. 2021a).
Dataset Splits | Yes | ASSR is constructed from MS-COCO (Lin et al. 2014) and SALICON (Jiang et al. 2015) and comprises 7646 images for training, 1436 images for validation and 2418 images for testing. IRSR consists of 6059 training images and 2929 testing images...
Hardware Specification | Yes | We develop SeqRank using Detectron2 (Wu et al. 2019) and train it with 4 NVIDIA A100-SXM4-80GB.
Software Dependencies | No | The paper mentions 'Detectron2 (Wu et al. 2019)' and backbone networks like 'ResNet' and 'Swin Transformer', along with the 'AdamW optimizer'. However, it does not provide specific version numbers for software dependencies such as Python, PyTorch/TensorFlow, or CUDA.
Experiment Setup | Yes | We set N = 100, d = 256, L = 2 and P = H/32, where H, W are the resolution of the input image, which is set to 800x800. During training, as suggested by (Cheng et al. 2022), we calculate ℓmask on K randomly sampled points instead of the whole mask to improve training efficiency and reduce training memory. K is set to 12544, i.e., 112x112 points. We trained SeqRank for 30k iterations with a batch size of 32. We adopt the AdamW optimizer with a weight decay of 1e-4. The initial learning rate is set to 1e-4 and is decayed to 1e-5 after 20k iterations. We use random flip for data augmentation.
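
The quoted optimizer and schedule map onto a short sketch. Below is a minimal PyTorch illustration, not the authors' released code: the `nn.Linear` stand-in and the empty loop body are placeholders, while the AdamW settings, the 30k-iteration budget, and the single 10x decay at 20k iterations are taken from the quote above.

```python
import torch
import torch.nn as nn

# Tiny stand-in module; the real SeqRank model lives in the linked repository.
model = nn.Linear(8, 8)

# AdamW with weight decay 1e-4 and initial learning rate 1e-4 (per the paper).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)

# The reported 1e-4 -> 1e-5 drop after 20k of 30k iterations is a single
# 10x step decay at iteration 20000.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20_000], gamma=0.1
)

for it in range(30_000):  # 30k iterations; the paper uses batch size 32
    if it in (0, 20_000):
        print(it, scheduler.get_last_lr())  # [1e-4] at start, [1e-5] after 20k
    # Real training would compute the loss on a batch and call backward() here.
    optimizer.step()
    scheduler.step()
```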
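
The point-sampling trick borrowed from (Cheng et al. 2022) can likewise be sketched. The snippet below is a hedged illustration assuming uniform point sampling for simplicity (Mask2Former itself mixes in importance sampling of uncertain points); `point_sampled_mask_loss` is a hypothetical helper, not part of the released code.

```python
import torch
import torch.nn.functional as F

def point_sampled_mask_loss(pred_logits, gt_masks, num_points=112 * 112):
    """BCE mask loss evaluated on K randomly sampled points (K = 12544 here)
    instead of every pixel, which reduces memory at 800x800 resolution."""
    b = pred_logits.shape[0]
    # K random point coordinates per image, scaled to [-1, 1] for grid_sample.
    grid = torch.rand(b, 1, num_points, 2, device=pred_logits.device) * 2 - 1

    def sample(x):  # (B, H, W) -> (B, K) values at the sampled points
        return F.grid_sample(
            x.unsqueeze(1), grid, align_corners=False
        ).reshape(b, num_points)

    return F.binary_cross_entropy_with_logits(
        sample(pred_logits), sample(gt_masks.float())
    )

# Example: predicted logits and binary ground-truth masks at 800x800.
loss = point_sampled_mask_loss(
    torch.randn(2, 800, 800), torch.rand(2, 800, 800) > 0.5
)
print(loss.item())
```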