CoSTA: End-to-End Comprehensive Space-Time Entanglement for Spatio-Temporal Video Grounding

Authors: Yaoyuan Liang, Xiao Liang, Yansong Tang, Zhao Yang, Ziran Li, Jingang Wang, Wenbo Ding, Shao-Lun Huang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments on the challenging benchmarks of HC-STVG and VidSTG, where CoSTA outperforms existing state-of-the-art methods, demonstrating its effectiveness for this task."
Researcher Affiliation | Collaboration | 1) Shenzhen Key Laboratory of Ubiquitous Data Enabling, Tsinghua Shenzhen International Graduate School, Tsinghua University; 2) University of Oxford; 3) Meituan Inc.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper neither states that the code for the described methodology is open-sourced nor links to a code repository.
Open Datasets | Yes | "We evaluate our proposed method on two mainstream benchmarks HC-STVG (Tang et al. 2021) and VidSTG (Zhang et al. 2020c)"
Dataset Splits | Yes | "The HC-STVG dataset ... is divided into training and test subsets with 4,500 and 1,160 video-sentence pairs. This dataset is extended to HC-STVG V2 ..., which contains 10,131 and 3,482 videos in the training and validation subsets, respectively. The VidSTG dataset ... is divided into training, validation and test subsets with 80,684, 8,956 and 10,303 distinct sentence-tube pairs."
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions using RoBERTa (Liu et al. 2019c) but does not give version numbers for it or for any other software libraries or frameworks used in the implementation.
Experiment Setup | No | The paper mentions a sampling mechanism with ratio β ∈ [0, 1] and balancing weights λ in the loss function, and discusses "faster convergence", but it does not provide values for hyperparameters such as learning rate, batch size, number of epochs, or detailed optimizer settings.