reproducibilityindex.ai

Hierarchical Attention Based Spatial-Temporal Graph-to-Sequence Learning for Grounded Video Description

Authors: Kai Shen, Lingfei Wu, Fangli Xu, Siliang Tang, Jun Xiao, Yueting Zhuang

IJCAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	"Our extensive experiments demonstrate the effectiveness of our proposed method compared to state-of-the-art methods." and "We conduct our experiments on the Grounded ActivityNet-Entities Dataset [Zhou et al., 2019] for evaluation."
Researcher Affiliation	Collaboration	Kai Shen¹, Lingfei Wu², Fangli Xu³, Siliang Tang¹, Jun Xiao¹ and Yueting Zhuang¹; ¹Zhejiang University, ²IBM Research, ³Squirrel AI Learning; {shenkai,siliang,junx,yzhuang}@zju.edu.cn, wuli@us.ibm.com, lili@yixue.us
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks labeled 'Pseudocode' or 'Algorithm'.
Open Source Code	No	The paper does not provide any specific repository link or explicit statement about the release of its source code for the methodology described.
Open Datasets	Yes	We conduct our experiments on the Grounded ActivityNet-Entities Dataset [Zhou et al., 2019] for evaluation.
Dataset Splits	Yes	"For a fair comparison, the data processing procedure is the same as [Zhou et al., 2019]." and "Table 2: Results on Grounded ActivityNet-Entities val set."
Hardware Specification	No	The paper does not provide specific hardware details such as exact GPU or CPU models used for running its experiments.
Software Dependencies	No	The paper mentions software components like 'Faster R-CNN' and 'ResNeXt-101 backbone' but does not specify version numbers for any libraries, frameworks, or specific software dependencies needed for replication.
Experiment Setup	Yes	Hyperparameter settings. We set the threshold ϵ in Eq. 3 to 0.4, λa to 0.04, λb to 0.08, λc to 0.5, and the number of heads m in Eq. 2 to 5. The KNN hyper-parameter p ∈ {5, 10, 20, 30, 40} varies across experiments as a result of model validation. The region proposal feature's original dimension d is 2048, the region proposal embedding dimension l is 1024, the word embedding size is 512, the RNN hidden size r is 1024, and the number of GCN layers k is 3. The λ in Eq. 4 is 0.8.
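
Since the paper releases no source code, anyone attempting a replication would have to encode the reported settings themselves. The following is a minimal sketch of one way to do that in Python. The class name, field names, and the `knn_sweep` helper are hypothetical and do not come from the paper; only the numeric values are taken from the Experiment Setup row above.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ReportedHyperparameters:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""

    epsilon: float = 0.4              # threshold in Eq. 3
    lambda_a: float = 0.04            # weight lambda_a
    lambda_b: float = 0.08            # weight lambda_b
    lambda_c: float = 0.5             # weight lambda_c
    num_heads: int = 5                # number of heads m in Eq. 2
    knn_p_candidates: Tuple[int, ...] = (5, 10, 20, 30, 40)  # KNN p values tried
    region_feature_dim: int = 2048    # original region proposal feature dimension d
    region_embedding_dim: int = 1024  # region proposal embedding dimension l
    word_embedding_dim: int = 512     # word embedding size
    rnn_hidden_dim: int = 1024        # RNN hidden size r
    gcn_layers: int = 3               # number of GCN layers k
    lambda_eq4: float = 0.8           # lambda in Eq. 4


def knn_sweep(hp: ReportedHyperparameters) -> List[Dict[str, object]]:
    """Expand the settings into one flat config dict per candidate KNN p."""
    base = asdict(hp)
    candidates = base.pop("knn_p_candidates")
    return [{**base, "knn_p": p} for p in candidates]


if __name__ == "__main__":
    for config in knn_sweep(ReportedHyperparameters()):
        print(config["knn_p"], config["num_heads"], config["gcn_layers"])
```

Keeping the KNN candidates as a tuple rather than a single value reflects that the paper reports a validated range for p, not one fixed setting.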