REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Authors: Yuanze Lin, Yujia Xie, Dongdong Chen, Yichong Xu, Chenguang Zhu, Lu Yuan
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on the standard OK-VQA dataset and achieve new state-of-the-art performance, i.e., 58.0% accuracy, surpassing previous state-of-the-art method by a large margin (+3.6%). We also conduct detailed analysis and show the necessity of regional information in different framework components for knowledge-based VQA. |
| Researcher Affiliation | Collaboration | University of Washington, Microsoft. yuanze@uw.edu, {yujiaxie, dochen, yicxu}@microsoft.com |
| Pseudocode | No | The paper describes the method using equations and text, but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is publicly available at https://github.com/yzleroy/REVIVE. |
| Open Datasets | Yes | OK-VQA dataset [22] is selected for evaluation, which is currently the largest knowledge-based VQA dataset. |
| Dataset Splits | No | The paper states 'The training and testing split consist of 9009 and 5046 samples respectively' but does not explicitly mention a validation split or its size. |
| Hardware Specification | Yes | We use 4 NVIDIA V100 32Gb to train models for 10K steps, with a batch size of 8. |
| Software Dependencies | No | The paper mentions specific pre-trained models like 'GLIP-T', 'Vinvl-Large', 'CLIP model (ViT-B/16 variant)', 'T5 model', and 'GPT-3', but does not provide specific version numbers for the underlying software libraries or environments (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | We use 4 NVIDIA V100 32Gb to train models for 10K steps, with a batch size of 8. The learning rate is 8e-5 and AdamW [19] is chosen as optimizer. The warm-up steps are 1K and the trained models are evaluated every 500 steps. |
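
The schedule in the Experiment Setup row can be written down as a small configuration sketch. The numbers (10K steps, batch size 8, learning rate 8e-5, 1K warm-up steps, evaluation every 500 steps) come from the quoted text. Everything else is an assumption made for illustration: the names `TrainConfig`, `lr_at`, and `eval_steps` are hypothetical, the warm-up is assumed to be linear, and the rate is assumed to stay constant afterwards, since the quoted text does not state the warm-up shape or any decay schedule. This is not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    total_steps: int = 10_000   # "train models for 10K steps"
    batch_size: int = 8         # the row does not say whether this is per GPU or total
    base_lr: float = 8e-5       # "learning rate is 8e-5" (used with AdamW)
    warmup_steps: int = 1_000   # "warm-up steps are 1K"
    eval_every: int = 500       # "evaluated every 500 steps"


def lr_at(step: int, cfg: TrainConfig) -> float:
    """Learning rate at a 0-indexed optimizer step.

    Assumption: linear warm-up to base_lr, then constant. The source
    states neither the warm-up shape nor a decay schedule.
    """
    if step < cfg.warmup_steps:
        return cfg.base_lr * (step + 1) / cfg.warmup_steps
    return cfg.base_lr


def eval_steps(cfg: TrainConfig) -> list[int]:
    """Steps after which an evaluation pass would run."""
    return list(range(cfg.eval_every, cfg.total_steps + 1, cfg.eval_every))


if __name__ == "__main__":
    cfg = TrainConfig()
    print(f"evaluations per run: {len(eval_steps(cfg))}")
    print(f"lr at step 0: {lr_at(0, cfg):.2e}, at step 5000: {lr_at(5000, cfg):.2e}")
```

Under these assumptions a run contains 20 evaluation points, which bounds how finely the best checkpoint can be selected.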