Explicit Knowledge-based Reasoning for Visual Question Answering
Authors: Peng Wang, Qi Wu, Chunhua Shen, Anthony Dick, Anton van den Hengel
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first evaluate different approaches automatically using simple string matching and Wu-Palmer similarity (WUPS) [Malinowski and Fritz, 2014a], in which the human answers are considered as ground truth. (A minimal WUPS sketch is given after the table.) |
| Researcher Affiliation | Academia | Peng Wang (1,2), Qi Wu (3), Chunhua Shen (2,3), Anthony Dick (2), Anton van den Hengel (2,3); 1: Northwestern Polytechnical University, China; 2: The University of Adelaide, Australia; 3: Australian Centre for Robotic Vision |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link to its dataset ('https://bitbucket.org/sxjzwq1987/kb-vqa-dataset'), but does not state that source code for the proposed method is available. |
| Open Datasets | Yes | We select 700 of the validation images from the MS COCO [Lin et al., 2014] dataset... The LSTM is trained on the training set of VQA data [Antol et al., 2015]. (See the dataset-selection sketch after the table.) |
| Dataset Splits | No | The paper mentions selecting '700 of the validation images from the MS COCO' to build the KB-VQA dataset, but does not specify a validation split for KB-VQA or for training the primary model (Ahab). The LSTM baseline description refers only to a 'training set' and a 'test set'. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as Quepy and NLTK ('Quepy begins by tagging each word in the question using NLTK [Bird et al., 2009]'), but does not give version numbers for these or any other software dependencies. (See the NLTK tagging sketch after the table.) |
| Experiment Setup | Yes | Specifically, we use the second fully-connected layer (4096-d) of a pre-trained VGG model as the image features, and the LSTM is trained on the training set of VQA data [Antol et al., 2015]. The LSTM layer contains 512 memory cells in each unit. (See the LSTM baseline sketch after the table.) |
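
The WUPS evaluation quoted under Research Type can be reproduced at a high level with NLTK's WordNet interface. The sketch below is a simplification, assuming single-word answers and the usual WUPS@0.9 down-weighting; it is not the authors' evaluation script.

```python
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # WordNet corpus is required once


def wup(word_a, word_b):
    """Best Wu-Palmer similarity over all synset pairs of two single words."""
    synsets_a, synsets_b = wn.synsets(word_a), wn.synsets(word_b)
    if not synsets_a or not synsets_b:
        return 0.0
    return max(sa.wup_similarity(sb) or 0.0
               for sa in synsets_a for sb in synsets_b)


def wups_score(prediction, ground_truth, threshold=0.9):
    """WUPS@threshold for one answer pair: similarities below the threshold
    are down-weighted by 0.1, as in the usual WUPS protocol."""
    score = wup(prediction, ground_truth)
    return score if score >= threshold else 0.1 * score


print(wups_score("dog", "puppy"))   # close in WordNet, scores high
print(wups_score("dog", "table"))   # distant, gets down-weighted
```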
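
The dataset construction quoted under Open Datasets starts from the MS COCO validation images. The snippet below is a hypothetical illustration of selecting 700 validation images with pycocotools; the paper does not say how its 700 images were chosen, so the random sampling and annotation path here are assumptions.

```python
import random

from pycocotools.coco import COCO

# Path to the COCO 2014 validation annotations is an assumption.
coco = COCO("annotations/instances_val2014.json")
all_image_ids = coco.getImgIds()

random.seed(0)  # illustrative only; the paper's selection criterion is unknown
selected_ids = random.sample(all_image_ids, 700)
selected_images = coco.loadImgs(selected_ids)
print(len(selected_images), selected_images[0]["file_name"])
```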
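
The Software Dependencies row quotes the paper's use of Quepy and NLTK without version numbers. The snippet below shows the kind of NLTK tagging step that quote refers to; the tokenizer and tagger resource names are current NLTK defaults, not versions reported by the paper.

```python
import nltk

# Resource names follow current NLTK releases; the paper does not pin versions.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

question = "How many dogs are in this image?"
tokens = nltk.word_tokenize(question)
print(nltk.pos_tag(tokens))
# e.g. [('How', 'WRB'), ('many', 'JJ'), ('dogs', 'NNS'), ('are', 'VBP'), ...]
```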
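
The Experiment Setup row describes the LSTM baseline: 4096-d features from the second fully-connected layer of a pre-trained VGG model, and an LSTM with 512 memory cells trained on VQA data. The PyTorch sketch below matches those two numbers, but everything else (vocabulary size, embedding size, elementwise-product fusion, answer vocabulary) is an assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LstmVqaBaseline(nn.Module):
    """Joint-embedding baseline: question LSTM plus projected VGG features."""

    def __init__(self, vocab_size=10000, embed_dim=300,
                 hidden_dim=512, img_dim=4096, num_answers=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # 512 memory cells, matching the setup quoted in the table.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Projects the 4096-d VGG fully-connected features to the LSTM size.
        self.img_proj = nn.Linear(img_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, question_tokens, image_features):
        q = self.embed(question_tokens)                # (B, T, embed_dim)
        _, (h_n, _) = self.lstm(q)                     # (1, B, hidden_dim)
        v = torch.tanh(self.img_proj(image_features))  # (B, hidden_dim)
        fused = h_n.squeeze(0) * v                     # elementwise fusion (assumed)
        return self.classifier(fused)                  # answer logits


model = LstmVqaBaseline()
logits = model(torch.randint(0, 10000, (2, 12)), torch.randn(2, 4096))
print(logits.shape)  # torch.Size([2, 1000])
```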