AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

Authors: Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto Del Bimbo

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply this loss function to several baseline models and evaluate its effectiveness on two VQA-CP benchmarks. Experimental results demonstrate that our proposed adapted margin cosine loss can enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of answer feature space learning.
Researcher Affiliation | Collaboration | Yangyang Guo (Shandong University), Liqiang Nie (Shandong University), Zhiyong Cheng (Shandong Artificial Intelligence Institute), Feng Ji (Alibaba Group), Ji Zhang (Alibaba Group), Alberto Del Bimbo (University of Florence)
Pseudocode | No | The paper includes mathematical formulations and a training pipeline diagram (Figure 2), but it does not contain any clearly labeled pseudocode blocks or algorithm listings.
Open Source Code | Yes | The code is released for the re-implementation of this work: https://github.com/guoyang9/AdaVQA
Open Datasets | Yes | To validate the effectiveness of the proposed loss function, we conducted extensive experiments on two VQA-CP datasets: VQA-CP v2 and VQA-CP v1 [Agrawal et al., 2018], which are two public benchmarks for estimating a model's capability of overcoming the language prior problem in VQA.
Dataset Splits | Yes | To validate the effectiveness of the proposed loss function, we conducted extensive experiments on two VQA-CP datasets: VQA-CP v2 and VQA-CP v1 [Agrawal et al., 2018], which are two public benchmarks for estimating a model's capability of overcoming the language prior problem in VQA. Notably, [Agrawal et al., 2018] curated a diagnostic dataset, VQA-CP, wherein the answer distributions per question type are significantly distinct between the train and test sets.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, or memory) used for running the experiments. It only mentions using pre-trained convolutional neural networks (CNNs) and recurrent neural networks (RNNs) in the context of the model pipeline, not the hardware.
Software Dependencies | No | The paper does not mention any specific software frameworks (e.g., TensorFlow, PyTorch) or their version numbers, nor does it list any other software dependencies with version information.
Experiment Setup | Yes | Different from most prior methods for overcoming the language prior problem in VQA, for all three baselines we simply replaced the original loss with our AdaVQA loss and did NOT change any other settings, such as embedding size, learning rate, optimizer, and batch size. We investigated the influence of the question-type entropy and the scale factor in Fig. 4.
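For readers who want a concrete picture of what such a drop-in loss replacement looks like, the sketch below shows a generic CosFace-style margin cosine loss in PyTorch. This is an illustrative assumption, not the paper's implementation (see the repository linked above for that): AdaVQA adapts the margin rather than fixing it, whereas this sketch uses a single fixed margin `m` and scale `s`, and the class name, dimensions, and hyperparameter values are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MarginCosineLoss(nn.Module):
    """CosFace-style margin cosine loss (illustrative sketch only).

    The actual AdaVQA loss adapts the margin instead of using a
    single fixed value; this sketch keeps one fixed margin for clarity.
    """

    def __init__(self, feat_dim: int, num_answers: int,
                 s: float = 16.0, m: float = 0.2):
        super().__init__()
        self.s = s  # scale factor (hypothetical value)
        self.m = m  # additive cosine margin (hypothetical value)
        # One learnable prototype vector per candidate answer.
        self.weight = nn.Parameter(torch.randn(num_answers, feat_dim))

    def forward(self, features: torch.Tensor,
                labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized fused features and
        # L2-normalized answer prototypes: shape (batch, num_answers).
        cosine = F.linear(F.normalize(features), F.normalize(self.weight))
        # Subtract the margin from the ground-truth answer's logit only,
        # then rescale before the softmax cross-entropy.
        one_hot = F.one_hot(labels, num_classes=cosine.size(1)).float()
        logits = self.s * (cosine - self.m * one_hot)
        return F.cross_entropy(logits, labels)


# Example usage with random data (dimensions are made up):
loss_fn = MarginCosineLoss(feat_dim=512, num_answers=3000)
fused = torch.randn(32, 512)             # fused image-question features
answers = torch.randint(0, 3000, (32,))  # ground-truth answer indices
loss = loss_fn(fused, answers)
```

Subtracting the margin only from the ground-truth answer's logit forces the correct answer to win by at least `m` in cosine space, which is the general mechanism behind learning a more discriminative answer feature space.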