Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Cross-Modal Feature Distribution Calibration for Few-Shot Visual Question Answering

Authors: Jing Zhang, Xiaoqiang Liu, Mingzhe Chen, Zhe Wang

AAAI 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our proposed CDCIN achieves excellent performance on few-shot VQA and outperforms state-of-the-art methods on three widely used benchmark datasets. To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies.
Researcher Affiliation	Academia	Department of Computer Science and Engineering, East China University of Science and Technology, China
Pseudocode	No	No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code	No	The paper does not provide an explicit statement or link for open-source code.
Open Datasets	Yes	To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies.
Dataset Splits	Yes	Finally, we randomly select 60% samples of the final set as the training set, 20% as the valid set, and the rest as the test set.
Hardware Specification	No	The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions using 'pre-trained GloVe' and 'Swin-Transformer' but does not specify software dependencies with version numbers required for reproduction.
Experiment Setup	No	The paper refers to 'supplementary materials' for 'details of datasets and implementation' but does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs) or optimizer settings in the main text.
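
The Dataset Splits row quotes a random 60% / 20% / 20% division into training, validation, and test sets. A minimal sketch of such a split is shown below, using only the Python standard library. The function name `split_indices`, the seed, and the sample count are hypothetical choices made for illustration; the paper's actual splitting code and random seed are not reported here.

```python
import random


def split_indices(n_samples, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and cut them into three parts.

    The fractions default to the 60/20/20 proportions quoted in the table;
    the seed is an assumption added so the split can be repeated.
    """
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)

    n_train = round(n_samples * train_frac)
    n_valid = round(n_samples * valid_frac)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # "the rest" goes to the test set
    return train, valid, test


# Hypothetical sample count, chosen only to make the proportions easy to read.
train, valid, test = split_indices(1000)
print(len(train), len(valid), len(test))
```

Fixing the seed is what makes a random split repeatable: without it, each run yields a different partition and reported numbers cannot be matched exactly.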