GL-RG: Global-Local Representation Granularity for Video Captioning

Authors: Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the challenging MSR-VTT and MSVD datasets show that our GL-RG outperforms recent state-of-the-art methods by a significant margin.
Researcher Affiliation | Collaboration | Liqi Yan (Fudan University, Westlake University, Rochester Institute of Technology), Qifan Wang (Meta AI), Yiming Cui (University of Florida), Fuli Feng (University of Science and Technology of China), Xiaojun Quan (Sun Yat-sen University), Xiangyu Zhang (Purdue University), Dongfang Liu (Rochester Institute of Technology)
Pseudocode | No | The paper includes equations and architectural diagrams, but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code is available at https://github.com/ylqi/GL-RG.
Open Datasets | Yes | We evaluate our GL-RG on the MSR-VTT dataset [Xu et al., 2016]. We also evaluate our GL-RG on the MSVD dataset [Chen and Dolan, 2011].
Dataset Splits | Yes | For MSR-VTT, we follow the data split of 6,513 videos for training, 497 for validation, and 2,990 for testing. For MSVD, we split the dataset into a 1,200-video training set, a 100-video validation set, and a 670-video testing set by contiguous index number (see the split sketch below).
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or processing units.
Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the implementation; it only mentions the use of models pre-trained on certain datasets.
Experiment Setup | Yes | Our decoder is trained with a learning rate of 0.0003 in the seeding phase and 0.0001 in the boosting phase. For each video, training uses 20 ground-truth captions on MSR-VTT and 17 on MSVD (see the training sketch below).