Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Dense Events Grounding in Video
Authors: Peijun Bao, Qian Zheng, Yadong Mu
Pages: 920-928
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on large-scale datasets ActivityNet Captions and TACoS. |
| Researcher Affiliation | Academia | Peijun Bao¹, Qian Zheng², Yadong Mu¹*; ¹Peking University, China; ²Nanyang Technological University, Singapore |
| Pseudocode | No | The paper describes the network architecture and components in text and diagrams but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | ActivityNet Captions (Krishna et al. 2017) consists of 19,209 untrimmed videos. ... TACoS (Regneri et al. 2013) consists of 127 videos. |
| Dataset Splits | Yes | For a fair comparison, following the experimental setting in single sentence grounding (Zhang et al. 2020; Yuan et al. 2019), we use val_1 as validation set and val_2 as testing set. There are 37,417, 17,505, and 17,031 moment-sentence pairs in the training, validation and testing set, respectively. ... Following the standard data splitting, there are totally 10,146, 4,589 and 4,083 moment-sentence pairs in the training, validation and testing set, respectively. |
| Hardware Specification | No | The paper mentions using 'pretrained CNN (Tran et al. 2015)' for feature extraction but does not specify any hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running its own experiments. |
| Software Dependencies | No | The paper mentions using 'Glove word embedding', 'LSTM', and 'Adam' but does not specify any software versions for libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages (e.g., Python). |
| Experiment Setup | Yes | During training, we use Adam (Kingma and Ba 2014) with learning rate of 1 × 10⁻⁴, the momentum of 0.9 and batch size of 4 as optimization algorithm. ... The channel numbers of sentence feature and video proposal feature d^S, d^V are all set to 512. We set the dimension of positional feature d_pos to 128 and the size of compact set n to 512. The number of sampled clips N is set to 32, 64 for ActivityNet Captions and TACoS respectively. For BM operations in the video encoder, we set sampling number of each proposal to 16, 32 for ActivityNet Captions and TACoS respectively. ... For binary cross entropy loss, the scaling thresholds µ_min and µ_max are set to 0.5 and 1.0 for ActivityNet Captions and 0.3 and 0.7 for TACoS. |
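
The Experiment Setup row lists one set of shared hyperparameters plus four values that differ by dataset. As a reading aid, the sketch below collects those quoted values into a plain Python configuration. It is not the authors' code: the paper releases none and names no framework, so the class name, field names, and helper function are all hypothetical. The reported "momentum of 0.9" is stored as given; treating it as Adam's first-moment coefficient would be an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SetupConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""

    # Dataset-specific values (no defaults, so they must be given explicitly).
    num_clips: int               # number of sampled clips N
    samples_per_proposal: int    # sampling number per proposal for BM operations
    mu_min: float                # lower scaling threshold of the BCE loss
    mu_max: float                # upper scaling threshold of the BCE loss

    # Values shared by both datasets.
    learning_rate: float = 1e-4  # Adam learning rate
    momentum: float = 0.9        # reported "momentum"; its mapping to an Adam beta is an assumption
    batch_size: int = 4
    d_sentence: int = 512        # channel number of the sentence feature
    d_video: int = 512           # channel number of the video proposal feature
    d_pos: int = 128             # dimension of the positional feature
    compact_set_size: int = 512  # size of the compact set n


ACTIVITYNET_CAPTIONS = SetupConfig(
    num_clips=32, samples_per_proposal=16, mu_min=0.5, mu_max=1.0
)
TACOS = SetupConfig(
    num_clips=64, samples_per_proposal=32, mu_min=0.3, mu_max=0.7
)


def differing_fields(a: SetupConfig, b: SetupConfig) -> list:
    """Return the sorted names of fields whose values differ between two configs."""
    return sorted(f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name))


if __name__ == "__main__":
    print(differing_fields(ACTIVITYNET_CAPTIONS, TACOS))
```

Keeping the dataset-specific fields without defaults makes it harder to reuse one dataset's clip count or thresholds for the other by accident.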