Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Chop Chop BERT: Visual Question Answering by Chopping VisualBERT’s Heads
Authors: Chenyu Gao, Qi Zhu, Peng Wang, Qi Wu
IJCAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As shown in the interesting echelon shape of the result matrices, experiments reveal different heads and layers are responsible for different question types, with higher-level layers activated by higher-level visual reasoning questions. Our experiments based on the VisualBERT, as for its general Transformer style architecture without more extra designs. |
| Researcher Affiliation | Academia | 1 School of Computer Science, Northwestern Polytechnical University, Xi'an, China; 2 School of Software, Northwestern Polytechnical University, Xi'an, China; 3 National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, China; 4 University of Adelaide, Australia |
| Pseudocode | No | The paper describes its methods but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | All of our experiments are conducted on the Task Driven Image Understanding Challenge (TDIUC) [Kafle and Kanan, 2017a] dataset, a large VQA dataset. This dataset was proposed to compensate for the bias in distribution of different question types of VQA 2.0 [Goyal et al., 2017]. |
| Dataset Splits | No | The paper mentions using the TDIUC dataset and fine-tuning but does not explicitly provide details about training, validation, and test dataset splits with percentages or counts. |
| Hardware Specification | Yes | Experiments are conducted on 4 NVIDIA GeForce 2080Ti GPUs with a batch size of 480. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | We load the model pre-trained on COCO Caption [Chen et al., 2015] dataset, then finetune it with a learning rate of 5e-5 on the TDIUC dataset. The maximal learning rate is 1e-3 and the batch size is 480. |