Cross-Modal Feature Distribution Calibration for Few-Shot Visual Question Answering
Authors: Jing Zhang, Xiaoqiang Liu, Mingzhe Chen, Zhe Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our proposed CDCIN achieves excellent performance on few-shot VQA and outperforms state-of-the-art methods on three widely used benchmark datasets. To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies. |
| Researcher Affiliation | Academia | Department of Computer Science and Engineering, East China University of Science and Technology, China |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code. |
| Open Datasets | Yes | To evaluate the effectiveness of CDCIN for few-shot VQA, we conduct a series of experiments on datasets based on widely used Toronto COCO-QA (Ren, Kiros, and Zemel 2015), Visual Genome-QA (Krishna et al. 2017) and VQA v2 (Goyal et al. 2017), including the quantitative analysis, qualitative analysis, ablation studies. |
| Dataset Splits | Yes | Finally, we randomly select 60% samples of the final set as the training set, 20% as the valid set, and the rest as the test set. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'pre-trained GloVe' and 'Swin-Transformer' but does not specify software dependencies with version numbers required for reproduction. |
| Experiment Setup | No | The paper refers to 'supplementary materials' for 'details of datasets and implementation' but does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs) or optimizer settings in the main text. |
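
The Dataset Splits row describes a random 60% / 20% / 20% partition into training, validation, and test sets. A minimal sketch of such a split is given below. Since the paper releases no code, the function name `split_60_20_20`, the fixed `seed`, and the use of plain Python lists are all assumptions made for illustration, not the authors' implementation.

```python
import random


def split_60_20_20(samples, seed=0):
    """Randomly partition samples into train/valid/test at 60/20/20.

    Hypothetical helper: the name, the seed, and the rounding rule are
    illustrative assumptions, not taken from the paper.
    """
    items = list(samples)
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(items)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(items) * 60 // 100
    n_valid = len(items) * 20 // 100

    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # the remainder goes to the test set
    return train, valid, test


train, valid, test = split_60_20_20(range(100))
print(len(train), len(valid), len(test))  # prints: 60 20 20
```

Giving the remainder to the test set matches the wording "the rest as the test set" and ensures that no sample is dropped when the dataset size is not a multiple of five.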