Curriculum Multi-Negative Augmentation for Debiased Video Grounding
Authors: Xiaohan Lan, Yitian Yuan, Hong Chen, Xin Wang, Zequn Jie, Lin Ma, Zhi Wang, Wenwu Zhu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on newly collected Charades-CD and ActivityNet-CD datasets demonstrate our proposed strategy can improve the performance of the base model in both i.i.d. and o.o.d. scenarios. |
| Researcher Affiliation | Collaboration | Tsinghua University; Meituan Inc. |
| Pseudocode | Yes | Algorithm 1: Multi-stage Curriculum Process |
| Open Source Code | Yes | Our codes are available at https://github.com/rubylan/Curri-Multi-NA |
| Open Datasets | Yes | To prove the effectiveness of our method, we conduct experiments on the newly collected Charades-CD and ActivityNet-CD datasets (Yuan et al. 2021). |
| Dataset Splits | Yes | The numbers of videos in train/val/test-iid/test-ood splits are 4,564/333/333/1,442, and the numbers of video-query pairs are 11,071/859/823/3,375, respectively. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using I3D, C3D, and GloVe for feature extraction and encoding, but does not specify version numbers for these or any other software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | As for the training strategy setting, we trained 30/20 (i.e., Tmax) epochs for Charades-CD/ActivityNet-CD and report results of the epoch on which the test-iid set performs best with metric R@1, IoU=0.7. The batch sizes and learning rates were set to 64/32 and 0.0005/0.0001, respectively. λ_{1,2,3} in L were all set to 5.0 for Charades-CD, and set to 15.0 for ActivityNet-CD. We adaptively trained the model with the multi-stage curriculum process and set the training stage update times T1, T2, and T3 to 3/7/18 and 2/5/13, respectively. As for the model architecture setting, to implement the Multi-NA strategy, we set the mask ratio α to 0.55 and the number of generated samples per NA type (i.e., N_{cc,vc,ss}) to 1 on both datasets. |
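The stage-update schedule quoted above (epoch thresholds T1/T2/T3 of 3/7/18 for Charades-CD and 2/5/13 for ActivityNet-CD) can be sketched as a simple epoch-to-stage lookup. This is a hedged illustration, not the authors' code: the function name `curriculum_stage` and the meaning assigned to each stage index are assumptions; only the threshold values come from the paper.

```python
def curriculum_stage(epoch: int, t1: int, t2: int, t3: int) -> int:
    """Return the curriculum stage index (0-3) active at a given epoch.

    Stage 0 is assumed to be an initial warm-up before any stage update;
    each threshold T1 < T2 < T3 advances the training to the next stage.
    """
    if epoch < t1:
        return 0
    if epoch < t2:
        return 1
    if epoch < t3:
        return 2
    return 3

# Charades-CD schedule from the paper: T1/T2/T3 = 3/7/18, Tmax = 30 epochs
charades_stages = [curriculum_stage(e, 3, 7, 18) for e in range(30)]

# ActivityNet-CD schedule: T1/T2/T3 = 2/5/13, Tmax = 20 epochs
anet_stages = [curriculum_stage(e, 2, 5, 13) for e in range(20)]
```

Under this reading, the final stage (index 3) runs for the remaining epochs after T3, i.e., 12 of 30 epochs on Charades-CD and 7 of 20 on ActivityNet-CD.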