Cross-Modal Feature Distribution Calibration for Few-Shot Visual Question Answering

Authors: Jing Zhang, Xiaoqiang Liu, Mingzhe Chen, Zhe Wang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our proposed CDCIN achieves excellent performance on few-shot VQA and outperforms state-of-the-art methods on three widely used benchmark datasets. To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies.
Researcher Affiliation | Academia | Department of Computer Science and Engineering, East China University of Science and Technology, China
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code.
Open Datasets | Yes | To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies.
Dataset Splits | Yes | Finally, we randomly select 60% samples of the final set as the training set, 20% as the valid set, and the rest as the test set.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using 'pre-trained GloVe' and 'Swin-Transformer' but does not specify software dependencies with version numbers required for reproduction.
Experiment Setup | No | The paper refers to 'supplementary materials' for 'details of datasets and implementation' but does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs) or optimizer settings in the main text.
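
The Dataset Splits entry above describes a random 60% / 20% / 20% partition into training, validation, and test sets. Since the paper releases no code, the following is only a minimal sketch of such a split using the Python standard library. The function name `split_60_20_20`, the fixed `seed`, and the integer rounding rule are assumptions made for illustration and are not taken from the paper.

```python
import random


def split_60_20_20(samples, seed=0):
    """Shuffle a copy of `samples` and cut it into train/valid/test lists.

    Hypothetical helper: the seed and the floor-based rounding are
    illustrative assumptions, not details reported by the authors.
    """
    shuffled = list(samples)
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(shuffled) * 6 // 10
    n_valid = len(shuffled) * 2 // 10

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # "the rest" goes to the test set
    return train, valid, test


train, valid, test = split_60_20_20(range(100))
print(len(train), len(valid), len(test))
```

Because the test set takes whatever remains after the first two cuts, every sample lands in exactly one subset even when the total is not divisible by ten.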