Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
Authors: Yitian Yuan, Lin Ma, Jingwen Wang, Wei Liu, Wenwu Zhu
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three public datasets demonstrate that our proposed model outperforms the state-of-the-arts with clear margins, illustrating the ability of SCDM to better associate and localize relevant video contents for temporal sentence grounding. |
| Researcher Affiliation | Collaboration | Yitian Yuan Tsinghua-Berkeley Shenzhen Institute Tsinghua University EMAIL Lin Ma Tencent AI Lab EMAIL Jingwen Wang Tencent AI Lab EMAIL Wei Liu Tencent AI Lab EMAIL Wenwu Zhu Tsinghua University EMAIL |
| Pseudocode | No | The paper describes its method using mathematical equations and descriptions but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | Yes | Our code for this paper is available at https://github.com/yytzsy/SCDM . |
| Open Datasets | Yes | We validate the performance of our proposed model on three public datasets for the TSG task: TACoS [24], Charades-STA [10], and ActivityNet Captions [17]. |
| Dataset Splits | No | The paper mentions 'training' and 'testing' but does not explicitly detail the partitioning of datasets into distinct training, validation, and test splits with specific percentages or sample counts. |
| Hardware Specification | Yes | The methods with released codes are run with one Nvidia TITAN XP GPU. |
| Software Dependencies | No | The paper mentions using specific features and models (C3D, I3D, GloVe, Bi-directional GRU) but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | For the design of temporal convolutional layers, 6 layers with {32, 16, 8, 4, 2, 1} temporal dimensions, 6 layers with {512, 256, 128, 64, 32, 16} temporal dimensions, and 8 layers with {512, 256, 128, 64, 32, 16, 8, 4} temporal dimensions are set for Charades-STA, TACoS, and ActivityNet Captions, respectively. ... Hidden dimension of the sentence Bi-directional GRU, dimension of the multimodal fused features d_f, and the filter number d_h for temporal convolution operations are all set as 512 in this paper. The trade-off parameters of the two loss terms λ and η are set as 100 and 10, respectively. |
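
The values quoted in the Experiment Setup row can be gathered into a small configuration record, which may help when checking a re-implementation against the paper. The sketch below is illustrative only: the class name `SCDMSetup`, its field names, and the `SETUPS` dictionary are hypothetical and are not taken from the released SCDM code; only the numeric values come from the quoted text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SCDMSetup:
    """Hypothetical container for the hyperparameters quoted above."""

    # Temporal dimension of each temporal convolutional layer, in order.
    temporal_dims: Tuple[int, ...]
    # Hidden dimension of the sentence Bi-directional GRU.
    gru_hidden_dim: int = 512
    # Dimension of the multimodal fused features (d_f).
    fused_dim: int = 512
    # Filter number for temporal convolution operations (d_h).
    conv_filters: int = 512
    # Trade-off parameters of the two loss terms (lambda and eta).
    loss_lambda: float = 100.0
    loss_eta: float = 10.0


# Per-dataset settings; only temporal_dims differs between datasets.
SETUPS: Dict[str, SCDMSetup] = {
    "Charades-STA": SCDMSetup(temporal_dims=(32, 16, 8, 4, 2, 1)),
    "TACoS": SCDMSetup(temporal_dims=(512, 256, 128, 64, 32, 16)),
    "ActivityNet Captions": SCDMSetup(
        temporal_dims=(512, 256, 128, 64, 32, 16, 8, 4)
    ),
}


def num_layers(dataset: str) -> int:
    """Number of temporal convolutional layers configured for a dataset."""
    return len(SETUPS[dataset].temporal_dims)


def halves_each_layer(dataset: str) -> bool:
    """True if every layer has half the temporal dimension of the previous one."""
    dims = SETUPS[dataset].temporal_dims
    return all(prev == 2 * cur for prev, cur in zip(dims, dims[1:]))
```

In all three quoted settings the temporal dimension halves from one layer to the next, so `halves_each_layer` serves as a quick sanity check on a transcribed configuration. Any value not quoted in the table (optimizer, learning rate, batch size) is deliberately left out.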