Leveraging Video Descriptions to Learn Video Question Answering
Authors: Kuo-Hao Zeng, Tseng-Hung Chen, Ching-Yao Chuang, Yuan-Hong Liao, Juan Carlos Niebles, Min Sun
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate performance on manually generated video-based QA pairs. The results show that our self-paced learning procedure is effective, and the extended SS model outperforms various baselines. |
| Researcher Affiliation | Academia | Department of Electrical Engineering, National Tsing Hua University; Department of Computer Science, Stanford University |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Available at http://aliensunmin.github.io/project/videolanguage/ |
| Open Datasets | Yes | We start by crawling an online curated video repository (http://jukinmedia.com/videos) to collect videos with high-quality descriptions. [...] 1Available at http://aliensunmin.github.io/project/videolanguage/ |
| Dataset Splits | Yes | We use 14100 videos and 151263 candidate QA pairs for training, 2000 videos and 21352 candidate QA pairs for validation, and 2000 videos and 2461 ground truth QA pairs for testing. |
| Hardware Specification | No | Only a general reference to 'GPU memory limit' was found. No specific GPU models, CPU models, or other detailed hardware specifications for running experiments were provided. |
| Software Dependencies | No | The paper mentions 'Tensor Flow (et al. 2015)' but does not provide specific version numbers for TensorFlow or any other software libraries or tools. |
| Experiment Setup | Yes | We implement and train all the extended methods using Tensor Flow (et al. 2015) with the batch size of 100 and selected the final model according to the best validation accuracy. Other model-specific training details are described below. E-MN. We use stochastic gradient descent with an initial learning rate of 0.001 [...] Inspired by several memory based models, we set 500 as the number of memories and the LSTM hidden dimension. [...] E-SA. We use the training settings as in (Yao et al. 2015), except for Adam optimization (Kingma and Ba 2015) with initial learning rate of 0.0001. E-SS. [...] We use Adam optimizer (Kingma and Ba 2015) with an initial learning rate of 0.0001. [...] at the first iteration of self-paced learning, we set γ to remove 10% QA pairs with small loss ratio in the training data. |
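The self-paced step quoted above ("we set γ to remove 10% QA pairs with small loss ratio in the training data") can be sketched as a simple percentile filter. This is a minimal illustration, not the authors' code: the function name `filter_qa_pairs`, the `drop_frac` parameter, and the use of a quantile to set γ are assumptions; the paper only states that γ is chosen so that 10% of the training QA pairs with the smallest loss ratio are removed at the first iteration.

```python
import numpy as np

def filter_qa_pairs(loss_ratios, drop_frac=0.10):
    """Sketch of one self-paced selection step.

    loss_ratios: per-QA-pair loss ratio computed by the current model
                 (hypothetical input; the paper defines this ratio).
    drop_frac:   fraction of pairs to remove (0.10 at the first iteration,
                 per the quoted setup).
    Returns a boolean keep-mask and the threshold gamma.
    """
    loss_ratios = np.asarray(loss_ratios, dtype=float)
    # Set gamma so that drop_frac of the pairs fall at or below it.
    gamma = np.quantile(loss_ratios, drop_frac)
    # Keep only pairs whose loss ratio exceeds gamma.
    keep = loss_ratios > gamma
    return keep, gamma

# Usage: with 10 candidate pairs, a 10% drop removes the single
# lowest-ratio pair.
ratios = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
keep, gamma = filter_qa_pairs(ratios, drop_frac=0.10)
```

In later self-paced iterations the threshold would be relaxed or retightened as the model improves; the paper's quoted setup only fixes the 10% value for the first iteration.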