Feature Enhancement in Attention for Visual Question Answering
Authors: Yuetan Lin, Zhangyang Pang, Donghui Wang, Yueting Zhuang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on the largest VQA v2.0 benchmark dataset and achieve competitive results without additional training data, and prove the effectiveness of our proposed feature-enhanced attention by visual demonstrations. |
| Researcher Affiliation | Academia | Yuetan Lin, Zhangyang Pang, Donghui Wang, Yueting Zhuang College of Computer Science, Zhejiang University Hangzhou, P. R. China {linyuetan,pzy,dhwang,yzhuang}@zju.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. |
| Open Datasets | Yes | We trained our model and conducted comparative experiments on VQA v2.0 [Goyal et al., 2017] dataset. |
| Dataset Splits | Yes | 10 human-labeled answer annotations per question are provided for training and validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions various models and optimizers (e.g., GloVe, GRU, RMSprop, Faster R-CNN) but does not provide specific software dependency names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | The learning rate is initialized to 3e-4 and kept fixed for the first 40 epochs, and is decayed every 10 epochs with a decay factor. The main differences between the two versions are the attention, the dropout usage and the learning-rate schedule. The base-att model uses dropout of 0.5 only after the word embedding layer, before generating the attention weights and before generating the answer, while the double-att model uses dropout of 0.5 before every linear layer. The learning rate decay factors of the two versions are 0.8 and 0.9, respectively. |
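
The learning-rate schedule quoted in the Experiment Setup row can be summarized with a minimal sketch. The paper releases no code, so the function name, the `double_att` flag, and the assumption that the first decay is applied at epoch 40 are hypothetical; only the initial rate (3e-4), the 40-epoch fixed period, the 10-epoch decay interval, and the 0.8/0.9 decay factors come from the quoted text.

```python
# Minimal sketch of the reported learning-rate schedule (hypothetical names;
# the exact epoch at which the first decay is applied is an assumption).
def learning_rate(epoch: int, double_att: bool = False) -> float:
    initial_lr = 3e-4                      # initial learning rate from the paper
    decay = 0.9 if double_att else 0.8     # double-att vs. base-att decay factor
    if epoch < 40:                         # kept fixed for the first 40 epochs
        return initial_lr
    decay_steps = (epoch - 40) // 10 + 1   # decayed every 10 epochs thereafter
    return initial_lr * decay ** decay_steps

# Example: base-att rates at epochs 0, 39, 40, 50, 60
rates = [learning_rate(e) for e in (0, 39, 40, 50, 60)]
# -> [3e-4, 3e-4, 2.4e-4, 1.92e-4, 1.536e-4]
```

Under these assumptions, the base-att and double-att runs differ only in the decay factor; the dropout placement described in the row is orthogonal to the schedule and is not modeled here.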