Weakly Supervised Dense Event Captioning in Videos

Authors: Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu, Junzhou Huang

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results are provided to demonstrate the ability of our model on both dense event captioning and sentence localization in videos. |
| Researcher Affiliation | Collaboration | 1 Tsinghua University, Beijing, China; 2 Tencent AI Lab; 3 MIT-IBM Watson AI Lab; 4 Microsoft Research Asia, Beijing, China |
| Pseudocode | No | The paper describes the methods in text and uses mathematical formulations, but it does not include any pseudocode blocks or algorithms labeled as such. |
| Open Source Code | Yes | Details about training are provided in the Supplementary materials and our Github repository. |
| Open Datasets | Yes | We conduct experiments on the ActivityNet Captions [10] dataset, which has been applied as the benchmark for dense video captioning. |
| Dataset Splits | Yes | We follow the suggested protocol by [10, 11] to use 50% of the videos for training, 25% for validation, and 25% for testing. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | Yes | Our code is implemented by Pytorch-0.3. |
| Experiment Setup | Yes | The trade-off parameters in our loss, i.e., λs and λa, are both set to 0.1. We train our model using stochastic gradient descent with an initial learning rate of 0.01 and a momentum factor of 0.9 (sketched below). |
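
Based on the reported setup, the following is a minimal PyTorch sketch of the optimizer configuration and loss weighting. The stand-in network, loss terms, and the way the weighted terms are combined are illustrative assumptions, not taken from the authors' repository, and the sketch uses current PyTorch APIs rather than the Pytorch-0.3 version named in the paper.

```python
import torch
import torch.nn as nn

# Reported hyperparameters: both trade-off weights set to 0.1.
LAMBDA_S = 0.1  # trade-off weight lambda_s
LAMBDA_A = 0.1  # trade-off weight lambda_a

# Stand-in network; the actual caption generator and sentence localizer
# live in the authors' repository and are not reproduced here.
model = nn.Linear(128, 10)

# SGD with the reported initial learning rate and momentum factor.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

def train_step(features, caption_target, loc_target):
    """One hypothetical update combining a main loss with two weighted
    auxiliary terms, mirroring the reported lambda_s / lambda_a weighting."""
    out = model(features)
    caption_loss = nn.functional.cross_entropy(out, caption_target)
    sentence_loss = nn.functional.mse_loss(out.mean(dim=1), loc_target)
    aux_loss = out.pow(2).mean()  # placeholder for the second auxiliary term
    total = caption_loss + LAMBDA_S * sentence_loss + LAMBDA_A * aux_loss
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()

# Example call with random data.
loss = train_step(torch.randn(4, 128),
                  torch.randint(0, 10, (4,)),
                  torch.randn(4))
```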