Differential Networks for Visual Question Answering

Authors: Chenfei Wu, Jinlai Liu, Xiaojie Wang, Ruifan Li (pp. 8997-9004)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We achieve state-of-the-art results on four publicly available datasets. Ablation studies also show the effectiveness of difference operations in DF model.
Researcher Affiliation | Academia | Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications; {wuchenfei, liujinlai, xjwang, rfli}@bupt.edu.cn
Pseudocode | No | The paper describes methodologies through equations and textual descriptions but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | More details, including source codes, will be published in the near future.
Open Datasets | Yes | We evaluate our model on four public datasets: the VQA 1.0 dataset (Antol et al. 2015), the VQA 2.0 dataset (Goyal et al. 2017), the COCO-QA dataset (Ren, Kiros, and Zemel 2015), and the TDIUC dataset (Kafle and Kanan 2017a).
Dataset Splits | Yes | The VQA 1.0 dataset contains a total of 614,163 samples and is divided into three splits: train (40.4%), val (19.8%), test (39.8%).
Hardware Specification | No | The paper mentions implementation in PyTorch and training parameters, but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper states 'We implement the model using Pytorch' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | During the data embedding phase, the image features are mapped to the size of 36 × 2048 and the text features are mapped to the size of 2400. In the differential fusion phase, the hidden layer size in DF is 510; hyperparameter S is 1, R is 5. The attention hidden unit number is 620. In the decision making phase, the hidden layer size in DF is 510. All the nonlinear layers of the model use the ReLU activation function and dropout (Srivastava et al. 2014) to prevent overfitting. All settings are commonly used in previous work. We implement the model using PyTorch. We use Adam (Kingma and Ba 2014) to train the model with a learning rate of 10^-4 and a batch size of 128.
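
For concreteness, the reported configuration can be assembled into a minimal PyTorch sketch. This only illustrates the quoted sizes and training settings (36 × 2048 image features, 2400-d question features, DF hidden size 510, attention hidden size 620, ReLU, dropout, Adam with learning rate 1e-4, batch size 128); the authors' differential fusion code was not released, so the FusionVQA class below, the elementwise-product fusion, the dropout probability of 0.5, and the 3000-way answer vocabulary are assumptions, not the paper's implementation.

import torch
import torch.nn as nn

# Reported sizes; the dropout rate and answer-vocabulary size are assumptions.
IMG_REGIONS, IMG_DIM = 36, 2048   # image features mapped to 36 x 2048
TXT_DIM = 2400                    # question features mapped to 2400
DF_HIDDEN = 510                   # DF hidden size (fusion and decision phases)
ATT_HIDDEN = 620                  # attention hidden units
NUM_ANSWERS = 3000                # assumed answer-vocabulary size
DROPOUT = 0.5                     # assumed dropout probability

class FusionVQA(nn.Module):
    # Generic fusion stand-in mirroring the reported layer sizes;
    # it does not reproduce the paper's differential (DF) operations.
    def __init__(self):
        super().__init__()
        self.img_proj = nn.Sequential(nn.Linear(IMG_DIM, DF_HIDDEN), nn.ReLU(), nn.Dropout(DROPOUT))
        self.txt_proj = nn.Sequential(nn.Linear(TXT_DIM, DF_HIDDEN), nn.ReLU(), nn.Dropout(DROPOUT))
        self.att = nn.Sequential(nn.Linear(DF_HIDDEN, ATT_HIDDEN), nn.ReLU(), nn.Linear(ATT_HIDDEN, 1))
        self.classifier = nn.Sequential(
            nn.Linear(DF_HIDDEN, DF_HIDDEN), nn.ReLU(), nn.Dropout(DROPOUT),
            nn.Linear(DF_HIDDEN, NUM_ANSWERS),
        )

    def forward(self, img_feats, q_feats):
        # img_feats: (B, 36, 2048), q_feats: (B, 2400)
        v = self.img_proj(img_feats)                     # (B, 36, 510)
        q = self.txt_proj(q_feats).unsqueeze(1)          # (B, 1, 510)
        fused = v * q                                    # elementwise fusion (assumption)
        weights = torch.softmax(self.att(fused), dim=1)  # attention over the 36 regions
        pooled = (weights * fused).sum(dim=1)            # (B, 510)
        return self.classifier(pooled)                   # (B, NUM_ANSWERS)

model = FusionVQA()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, lr = 1e-4 as reported

A forward pass at the reported batch size would consume tensors of shape (128, 36, 2048) and (128, 2400).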