MomentDiff: Generative Video Moment Retrieval from Random to Real

Authors: Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The experimental results demonstrate that our efficient framework consistently outperforms state-of-the-art methods on three public benchmarks, and exhibits better generalization and robustness on the proposed anti-bias datasets. We evaluate the efficacy of our model by conducting experiments on three representative datasets: Charades-STA [23], QVHighlights [37] and TACoS [74]."
Researcher Affiliation | Collaboration | Pandeng Li¹, Chen-Wei Xie², Hongtao Xie¹, Liming Zhao², Lei Zhang¹, Yun Zheng², Deli Zhao², Yongdong Zhang¹ (¹ University of Science and Technology of China, Hefei, China; ² Alibaba Group)
Pseudocode | Yes | "Algorithm 1: MomentDiff Training in a PyTorch-like style. ... Algorithm 2: MomentDiff inference in a PyTorch-like style."
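
The quoted evidence refers to two PyTorch-style algorithm listings that are not reproduced in this summary. As orientation for the "random to real" idea, the sketch below shows what a diffusion-style training step and iterative inference loop over temporal spans can look like in PyTorch; the denoiser interface, the linear noise schedule, the (center, width) span encoding, and the simplified refinement sampler are illustrative assumptions, not the authors' Algorithm 1 or 2.

```python
# Minimal sketch of a diffusion-style "random to real" span predictor.
# All interfaces and the noise schedule are assumptions for illustration.
import torch
import torch.nn.functional as F

T = 50                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)    # assumed linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def train_step(denoiser, video, text, gt_spans, optimizer):
    """Corrupt ground-truth (center, width) spans with Gaussian noise at a
    random timestep and train the denoiser to recover them (L1 term only;
    the paper's IoU and classification terms are omitted here)."""
    b = gt_spans.size(0)
    t = torch.randint(0, T, (b,))
    a_bar = alphas_bar[t].view(b, 1, 1)
    noise = torch.randn_like(gt_spans)
    noisy = (a_bar.sqrt() * gt_spans + (1 - a_bar).sqrt() * noise).clamp(0, 1)
    pred = denoiser(video, text, noisy, t)            # predicted spans in [0, 1]
    loss = F.l1_loss(pred, gt_spans)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

@torch.no_grad()
def infer_spans(denoiser, video, text, num_spans):
    """Start from random spans and iteratively refine them. A real sampler
    would typically use DDIM-style updates; this loop simply re-applies the
    denoiser for brevity."""
    spans = torch.rand(video.size(0), num_spans, 2)   # random (center, width)
    for t in reversed(range(T)):
        t_batch = torch.full((video.size(0),), t, dtype=torch.long)
        spans = denoiser(video, text, spans, t_batch).clamp(0, 1)
    return spans

# Smoke test with a dummy identity "denoiser":
dummy = lambda v, q, s, t: s
video, text = torch.randn(4, 75, 256), torch.randn(4, 32, 256)
print(infer_spans(dummy, video, text, num_spans=5).shape)  # torch.Size([4, 5, 2])
```
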
Open Source Code | Yes | "The code, model, and anti-bias evaluation datasets are available at https://github.com/IMCCretrieval/MomentDiff."
Open Datasets | Yes | "We evaluate the efficacy of our model by conducting experiments on three representative datasets: Charades-STA [23], QVHighlights [37] and TACoS [74]. Public datasets. Charades-STA [23] serves as a benchmark dataset... QVHighlights [37] contains... TACoS [74] is compiled..."
Dataset Splits | Yes | "The training and testing divisions are consistent with existing methods [28, 38]. ... The training set, validation set and test set include 7,218, 1,550 and 1,542 video-text pairs, respectively. ... We use the same dataset split [31], which consists of 10,146, 4,589, and 4,083 video-query pairs for the training, validation, and testing sets, respectively."
Hardware Specification | Yes | "For all datasets, we optimize MomentDiff for 100 epochs on one NVIDIA Tesla A100 GPU, employ Adam optimizer [77] with 1e-4 weight decay and fix the batch size as 32."
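
The quoted recipe (Adam, 1e-4 weight decay, batch size 32, 100 epochs, a single A100) maps onto a standard PyTorch training scaffold; in the runnable sketch below, a dummy linear model and random tensors stand in for the actual MomentDiff network and dataset loaders.

```python
# Training scaffold matching the reported recipe; the model and data are
# toy stand-ins, not the released MomentDiff code.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(256, 2).to(device)                     # dummy span regressor
data = TensorDataset(torch.randn(320, 256), torch.rand(320, 2))
loader = DataLoader(data, batch_size=32, shuffle=True)   # batch size 32

# Adam with 1e-4 weight decay (learning rate 1e-4, per the experiment setup row).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

for epoch in range(100):                                 # 100 epochs
    for feats, spans in loader:
        feats, spans = feats.to(device), spans.to(device)
        loss = nn.functional.l1_loss(model(feats), spans)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```
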
Software Dependencies | No | The paper mentions the 'PyTorch framework [84]' but does not specify a version number for PyTorch or any other software dependency.
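
Since no versions are pinned, anyone re-running the experiments has to record the environment themselves; one minimal way to log the relevant versions at the start of a run:

```python
# Record the software/hardware environment, since the paper does not pin versions.
import sys
import torch

print("python :", sys.version.split()[0])
print("pytorch:", torch.__version__)
print("cuda   :", torch.version.cuda)
print("gpu    :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "cpu")
```
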
Experiment Setup | Yes | "We set the hidden size D = 256 in all Transformer layers. ... The number of random spans Nr is set to 10 for QVHighlights, 5 for Charades-STA and TACoS. ... we optimize MomentDiff for 100 epochs on one NVIDIA Tesla A100 GPU, employ Adam optimizer [77] with 1e-4 weight decay and fix the batch size as 32. The learning rate is set to 1e-4. By default, the loss hyperparameters λL1 = 10, λiou = 1 and λce = 4. The weight values for Lsim and Lvmr are 4 and 1."
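
Collected in one place, the quoted hyperparameters amount to a small configuration plus a weighted sum of loss terms. The sketch below only aggregates the numbers above; how Lvmr nests with the L1/IoU/CE terms is not spelled out in the quote, so the flat weighted sum and the placeholder loss values are assumptions.

```python
# Hyperparameters as quoted; the loss-combination helper and its inputs are
# placeholders showing how the lambda weights would be applied.
import torch

config = {
    "hidden_size": 256,        # D = 256 in all Transformer layers
    "num_random_spans": 10,    # 10 for QVHighlights, 5 for Charades-STA / TACoS
    "lr": 1e-4,
    "weight_decay": 1e-4,
    "batch_size": 32,
    "epochs": 100,
    "lambda_l1": 10.0,
    "lambda_iou": 1.0,
    "lambda_ce": 4.0,
    "lambda_sim": 4.0,
    "lambda_vmr": 1.0,
}

def weighted_loss(l1, iou, ce, sim, vmr, cfg=config):
    """Flat weighted sum of the individual loss terms (an assumption; the
    paper may group some of these terms inside Lvmr)."""
    return (cfg["lambda_l1"] * l1 + cfg["lambda_iou"] * iou + cfg["lambda_ce"] * ce
            + cfg["lambda_sim"] * sim + cfg["lambda_vmr"] * vmr)

# Example with placeholder scalar losses:
print(weighted_loss(torch.tensor(0.10), torch.tensor(0.20), torch.tensor(0.30),
                    torch.tensor(0.05), torch.tensor(0.40)))
```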