Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
Authors: Yuanze Lin, Yujia Xie, Dongdong Chen, Yichong Xu, Chenguang Zhu, Lu Yuan
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on the standard OK-VQA dataset and achieve new state-of-the-art performance, i.e., 58.0% accuracy, surpassing previous state-of-the-art method by a large margin (+3.6%). We also conduct detailed analysis and show the necessity of regional information in different framework components for knowledge-based VQA. |
| Researcher Affiliation | Collaboration | University of Washington; Microsoft |
| Pseudocode | No | The paper describes the method using equations and text, but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is publicly available at https://github.com/yzleroy/REVIVE. |
| Open Datasets | Yes | OK-VQA dataset [22] is selected for evaluation, which is currently the largest knowledge-based VQA dataset. |
| Dataset Splits | No | The paper states 'The training and testing split consist of 9009 and 5046 samples respectively' but does not explicitly mention a validation split or its size. |
| Hardware Specification | Yes | We use 4 NVIDIA V100 32Gb to train models for 10K steps, with a batch size of 8. |
| Software Dependencies | No | The paper mentions specific pre-trained models like 'GLIP-T', 'Vinvl-Large', 'CLIP model (ViT-B/16 variant)', 'T5 model', and 'GPT-3', but does not provide specific version numbers for the underlying software libraries or environments (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | We use 4 NVIDIA V100 32Gb to train models for 10K steps, with a batch size of 8. The learning rate is 8e-5 and AdamW [19] is chosen as optimizer. The warm-up steps are 1K and the trained models are evaluated every 500 steps. |
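
The training setup quoted in the Experiment Setup row can be collected into a small configuration sketch. This is illustrative only and is not taken from the REVIVE repository: the names `TrainConfig`, `lr_at_step`, and `eval_steps` are hypothetical, and the shape of the learning-rate schedule (linear warm-up, then constant) is an assumption, since the quoted text gives only the number of warm-up steps. The quote also does not say whether the batch size of 8 is per GPU or in total, so the sketch records the number without interpreting it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row."""
    num_gpus: int = 4             # 4 NVIDIA V100 GPUs
    total_steps: int = 10_000     # "10K steps"
    batch_size: int = 8           # per-GPU vs. total is not stated
    learning_rate: float = 8e-5
    warmup_steps: int = 1_000     # "warm-up steps are 1K"
    eval_every: int = 500         # "evaluated every 500 steps"
    optimizer: str = "AdamW"


def lr_at_step(step: int, cfg: TrainConfig) -> float:
    """Learning rate at a 0-indexed step.

    ASSUMPTION: linear warm-up followed by a constant rate. The source
    does not describe the schedule beyond the warm-up length.
    """
    if step < cfg.warmup_steps:
        return cfg.learning_rate * (step + 1) / cfg.warmup_steps
    return cfg.learning_rate


def eval_steps(cfg: TrainConfig) -> List[int]:
    """Steps at which the model would be evaluated (every `eval_every`)."""
    return list(range(cfg.eval_every, cfg.total_steps + 1, cfg.eval_every))


cfg = TrainConfig()
```

Under these assumptions, `eval_steps(cfg)` lists the checkpoints implied by the quoted evaluation interval, and `lr_at_step(step, cfg)` gives the rate at any step; replacing the schedule with a decaying one would only require changing the second branch of `lr_at_step`.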