Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation

Authors: Tianyu Guo, Haowei Wang, Yiwei Ma, Jiayi Ji, Xiaoshuai Sun

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on PNG benchmark datasets reveal that our approach achieves state-of-the-art performance, significantly outperforming existing methods by a considerable margin and yielding a 3.9-point improvement in overall metrics.
Researcher Affiliation | Academia | Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China {guotianyu, wanghaowei, yiweima}@stu.xmu.edu.cn, jjyxmu@gmail.com, xssun@xmu.edu.cn
Pseudocode | No | The paper includes mathematical equations and descriptive text, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code | Yes | Our codes and results are available at our project webpage: https://github.com/TianyuGoGO/XPNG
Open Datasets | Yes | We trained and evaluated our model on the Panoptic Narrative Grounding (PNG) dataset (González et al. 2021)
Dataset Splits | No | In total, the PNG dataset comprises 133,103 training images and 8,380 test images, accompanied by 875,073 and 56,531 segmentation annotations, respectively. No specific validation set size or split is provided.
Hardware Specification | Yes | All experiments are conducted on an A100 GPU with a batch size of 11.
Software Dependencies | No | The paper mentions using FPN, ResNet101, and BERT models, but does not provide specific version numbers for these or any other software dependencies or libraries.
Experiment Setup | Yes | Images are resized so that the short side is 800 pixels while maintaining the aspect ratio, and the long side is 1333 pixels. For language input, ... The maximum token length is set to 230. We employ the Adam optimizer with an initial learning rate of 1e-4, which is halved every two epochs after the tenth epoch. The learning rate for BERT is set to 1e-5. The number of iteration update stages is set to 3. ... with a batch size of 11.