An Empirical Study on the Language Modal in Visual Question Answering

Authors: Daowan Peng, Wei Wei, Xian-Ling Mao, Yuanyuan Fu, Dangyang Chen

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "This paper attempts to provide new insights into the influence of language modality on VQA performance from an empirical study perspective. To achieve this, we conducted a series of experiments on six models. The results of these experiments revealed that..."
Researcher Affiliation | Collaboration | (1) Cognitive Computing and Intelligent Information Processing (CCIIP) Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology; (2) Joint Laboratory of HUST and Pingan Property & Casualty Research (HPL); (3) Department of Computer Science and Technology, Beijing Institute of Technology; (4) Ping An Property & Casualty Insurance Company of China, Ltd.
Pseudocode | No | No pseudocode or algorithm blocks are explicitly presented in the paper.
Open Source Code | No | The paper does not contain an explicit statement about releasing its own source code, nor does it provide a link to a code repository.
Open Datasets | Yes | "Dataset: We selected the widely-used VQAv2 benchmark [Goyal et al., 2017] and its OOD benchmark, VQA-CPv2 [Agrawal et al., 2018]."
Dataset Splits | Yes | "The results on the VQAv2 validation split demonstrate that all models experience varying degrees of performance degradation when evaluated on variant questions."
Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, or cloud instances with specifications) used for running the experiments is mentioned in the paper.
Software Dependencies | No | The paper mentions software components such as BERT, LSTM, and GRU, but does not specify their version numbers or the other software dependencies required for replication.
Experiment Setup | Yes | "Regarding the implementation details in the training process, we adhere to the experimental settings of the open-source codes and do not modify other parameters such as learning rate, batch size, or optimizer."