Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
Authors: Zhihang Liu, Jun Li, Hongtao Xie, Pandeng Li, Jiannan Ge, Sun-Ao Liu, Guoqing Jin
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three widely used benchmarks, including the out-of-distribution settings, show that the proposed framework achieves a new state-of-the-art performance with notable generalization ability (e.g., 4.42% and 7.69% average gains of R1@0.7 on Charades-STA and Charades-CG). The code will be available at https://github.com/lntzm/MESM. |
| Researcher Affiliation | Collaboration | 1 University of Science and Technology of China, Hefei, China 2 People's Daily Online |
| Pseudocode | No | The paper describes the methods in text and uses figures (Figure 2, Figure 3) to illustrate the pipeline, but it does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code will be available at https://github.com/lntzm/MESM. |
| Open Datasets | Yes | We evaluate the proposed method on three widely used datasets, which are Charades-STA (Gao et al. 2017), TACoS (Regneri et al. 2013), and QVHighlights (Lei, Berg, and Bansal 2021). We also experiment on Charades-CG (Li et al. 2022a), which proposes out-of-distribution (OOD) settings for Charades-STA. |
| Dataset Splits | Yes | We evaluate the proposed method on three widely used datasets, which are Charades-STA (Gao et al. 2017), TACoS (Regneri et al. 2013), and QVHighlights (Lei, Berg, and Bansal 2021). We also experiment on Charades-CG (Li et al. 2022a), which proposes out-of-distribution (OOD) settings for Charades-STA. |
| Hardware Specification | Yes | We build our model upon QD-DETR (Moon et al. 2023) with some optimizations, and train our model with Adam optimizer (Kingma and Ba 2014) on a single NVIDIA RTX 3090. |
| Software Dependencies | No | The paper mentions building upon 'QD-DETR (Moon et al. 2023)' and using 'Adam optimizer (Kingma and Ba 2014)', but it does not specify version numbers for these or other software libraries (e.g., PyTorch, TensorFlow, Python version) used in the experiments. |
| Experiment Setup | Yes | We set γ as 0.9, the hidden dimension of the transformer layers as 256, the layers of FW-MESM, MA, transformer encoder, and decoder as 2. |
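The hyperparameters quoted in the experiment-setup row can be collected into a single configuration for reference. This is a minimal sketch: the key names (e.g. `gamma`, `hidden_dim`) are our assumptions for illustration; only the values come from the paper's reported setup.

```python
# Hypothetical config sketch of the reported MESM hyperparameters.
# Key names are assumptions; values are taken from the paper's setup quote.
mesm_config = {
    "gamma": 0.9,             # γ weighting factor reported in the paper
    "hidden_dim": 256,        # hidden dimension of the transformer layers
    "num_fw_mesm_layers": 2,  # FW-MESM layers
    "num_ma_layers": 2,       # MA layers
    "num_encoder_layers": 2,  # transformer encoder layers
    "num_decoder_layers": 2,  # transformer decoder layers
    "optimizer": "Adam",      # Adam optimizer (Kingma and Ba 2014)
}

if __name__ == "__main__":
    for key, value in mesm_config.items():
        print(f"{key}: {value}")
```

Note that the paper does not report other common settings (learning rate, batch size, number of epochs), so reproduction would require consulting the released code at https://github.com/lntzm/MESM.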