Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation

Authors: Zhao Yang, Jiaqi Wang, Yansong Tang, Kai Chen, Hengshuang Zhao, Philip H.S. Torr

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results on the challenging datasets of RefCOCO, RefCOCO+, and G-Ref demonstrate its advantage with respect to the state-of-the-art methods."
Researcher Affiliation | Collaboration | ¹University of Oxford, ²Shanghai AI Laboratory, ³Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, ⁴The University of Hong Kong
Pseudocode | No | The paper includes schematic illustrations of the proposed method (e.g., Figure 3), but it does not contain a formal pseudocode block or algorithm section.
Open Source Code | No | The paper does not provide any explicit statement about releasing their source code, nor does it include a link to a code repository for their method.
Open Datasets | Yes | "We evaluate our proposed method on the datasets of RefCOCO (Yu et al. 2016), RefCOCO+ (Yu et al. 2016), and G-Ref (Mao et al. 2016; Nagaraja, Morariu, and Davis 2016)."
Dataset Splits | Yes | "For each dataset, the model is trained on the training set for 40 epochs with batch size 32... Images are resized to 480 × 480 resolution... On the validation, test A, and test B subsets of RefCOCO..."
Hardware Specification | Yes | "We measure the inference time by averaging over 500 forward passes using batch size 1 at 480 × 480 input resolution on an NVIDIA Quadro RTX 8000."
Software Dependencies | No | The paper mentions key software components, such as the BERT-base model from (Devlin et al. 2019), the Swin-B model from (Liu et al. 2021b), and Hugging Face (Wolf et al. 2020), but it does not specify version numbers for general software libraries or frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We adopt an AdamW (Loshchilov and Hutter 2019) optimizer with initial learning rate 5e-5 and weight decay 1e-2, and apply the poly learning rate scheduler (Chen et al. 2018). The default number of iterations (n in Sec. 3) is 3, for which the loss weights, λ1, λ2, and λ3, are 0.15, 0.15, and 0.7, respectively. For each dataset, the model is trained on the training set for 40 epochs with batch size 32, where each object is sampled exactly once in an epoch (with one of its text annotations randomly sampled). Images are resized to 480 × 480 resolution and sentence lengths are capped at 20."