Temporally Grounding Language Queries in Videos by Contextual Boundary-Aware Prediction

Authors: Jingwen Wang, Lin Ma, Wenhao Jiang (pp. 12168-12175)

AAAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental "We conduct extensive experiments on three public datasets: TACoS (Regneri et al. 2013), Charades-STA (Gao et al. 2017), and ActivityNet Captions (Krishna et al. 2017)." "Table 1: Performance comparison on TACoS (Regneri et al. 2013) dataset. All results are reported in percentage (%)." "Table 2: Ablation study on TACoS (Regneri et al. 2013) dataset. All results are reported in percentage (%)."
Researcher Affiliation Industry Jingwen Wang, Lin Ma, Wenhao Jiang; Tencent AI Lab; {jaywongjaywong, forest.linma, cswhjiang}@gmail.com
Pseudocode No The paper describes its proposed method and training process in text and figures but does not include any structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide concrete access to source code for the described methodology, nor does it explicitly state that the code has been released or is otherwise available.
Open Datasets Yes "We conduct extensive experiments on three public datasets: TACoS (Regneri et al. 2013), Charades-STA (Gao et al. 2017), and ActivityNet Captions (Krishna et al. 2017)."
Dataset Splits Yes "The same split as (Gao et al. 2017) is used, which includes 10146, 4589, 4083 query-segment pairs for training, validation and testing." "The train/test split is 12408/3720." "We merge the two validation subsets val1, val2 as our test split, as (Chen et al. 2018)."
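The reported split sizes can be collected into a small sanity-check table. This is an illustrative sketch only; the names `SPLITS` and `total_pairs` are not from the paper, and the ActivityNet Captions counts are omitted because the review quotes only the merging of val1/val2, not the pair counts.

```python
# Query-segment pair counts per dataset, as quoted in the review above.
SPLITS = {
    "TACoS":        {"train": 10146, "val": 4589, "test": 4083},
    "Charades-STA": {"train": 12408, "test": 3720},
    # ActivityNet Captions: val1 and val2 are merged and used as the test
    # split, following (Chen et al. 2018); counts are not quoted here.
}

def total_pairs(splits):
    """Sum the annotation counts across a dataset's splits."""
    return sum(splits.values())
```

For example, `total_pairs(SPLITS["TACoS"])` gives the total number of TACoS query-segment pairs implied by the quoted split sizes.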
Hardware Specification No The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies No The paper mentions several software components like C3D features, GloVe word embeddings, Match-LSTM, and LSTM, but does not provide specific version numbers for any of these or the underlying frameworks used for implementation.
Experiment Setup Yes "We set hidden neuron size of LSTM to 512. We generally design the K anchors to cover at least 95% of training segments. Therefore, we empirically set K to 32, 20 and 100 for TACoS, Charades-STA and ActivityNet Captions, respectively. The NMS thresholds are 0.3, 0.55 and 0.55, respectively."
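The quoted setup pairs a per-dataset NMS threshold with the anchor count K. As a minimal sketch of how such a threshold is typically applied, the snippet below implements greedy temporal NMS over 1-D segments and records the quoted hyperparameters in a config dict. The function and variable names (`temporal_iou`, `temporal_nms`, `CONFIG`) are illustrative assumptions, not the authors' implementation.

```python
def temporal_iou(a, b):
    """IoU between two temporal segments, each given as (start, end)."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def temporal_nms(segments, scores, threshold):
    """Greedy NMS: keep highest-scoring segments, suppress overlaps above threshold.

    Returns indices of kept segments, in descending score order.
    """
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(temporal_iou(segments[i], segments[j]) <= threshold for j in keep):
            keep.append(i)
    return keep

# Per-dataset hyperparameters quoted in the review (hidden size, K, NMS threshold).
CONFIG = {
    "TACoS":                {"lstm_hidden": 512, "K": 32,  "nms": 0.30},
    "Charades-STA":         {"lstm_hidden": 512, "K": 20,  "nms": 0.55},
    "ActivityNet Captions": {"lstm_hidden": 512, "K": 100, "nms": 0.55},
}
```

With the TACoS threshold of 0.3, two heavily overlapping candidate segments would collapse to the higher-scoring one, while a disjoint segment survives.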