Declaration-based Prompt Tuning for Visual Question Answering
Authors: Yuhang Liu, Wei Wei, Daowan Peng, Feida Zhu
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the GQA dataset show that DPT outperforms its fine-tuned counterpart by a large margin in accuracy, in both the fully-supervised (2.68%) and zero-shot/few-shot (over 31%) settings. |
| Researcher Affiliation | Collaboration | Yuhang Liu (1,2), Wei Wei (1,2), Daowan Peng (1,2) and Feida Zhu (3). (1) Cognitive Computing and Intelligent Information Processing (CCIIP) Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology, China; (2) Joint Laboratory of HUST and Pingan Property & Casualty Research (HPL), China; (3) School of Computing and Information Systems, Singapore Management University, Singapore |
| Pseudocode | No | The paper describes methods through textual descriptions and equations (e.g., Equation 1-14) and includes a framework diagram (Figure 2), but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code. |
| Open Source Code | No | The paper states only that "All the data and codes will be available to facilitate future research," without providing a repository link or URL. |
| Open Datasets | Yes | Datasets. GQA [Hudson and Manning, 2019a] and VQA v2.0 [Agrawal et al., 2015] are used to build declaration generation dataset and evaluate our proposed methods on VQA task. More details are provided in the Appendix. |
| Dataset Splits | Yes | For a deeper understanding of DPT, we further conduct the ablation studies on the local validation split of GQA and VQA v2.0 datasets (test-dev on GQA and val on VQA v2.0). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models, memory, or other system specifications. |
| Software Dependencies | No | The paper mentions models and architectures used (e.g., 'T5-small', 'VinVL'), but it does not specify versions for general software dependencies or libraries such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | The number of answers used for ITM, K, is set to 8. For a fair comparison, the paper follows the same training settings as reported in previous works; hyper-parameter details are reported in the Appendix. |
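The experiment-setup row mentions using the top-K (K = 8) answer candidates for image-text matching (ITM). The sketch below is a hedged illustration of that re-ranking idea, not the authors' code: the `itm_score` function is a hypothetical stand-in for a real image-text matching model, and the declaration template with a `[MASK]` slot is assumed from the paper's declaration-based formulation.

```python
# Hedged sketch (not the authors' implementation): rank the top-K answer
# candidates by filling the [MASK] slot of a declaration and scoring each
# completed sentence with an image-text matching function.
from typing import Callable, List, Tuple


def rank_candidates(
    declaration: str,                    # e.g. "The color of the bus is [MASK]."
    candidates: List[str],               # top-K answer candidates (K = 8 in the paper)
    itm_score: Callable[[str], float],   # hypothetical image-text matching scorer
    k: int = 8,
) -> List[Tuple[str, float]]:
    """Fill [MASK] with each candidate and sort by descending ITM score."""
    scored = [
        (answer, itm_score(declaration.replace("[MASK]", answer)))
        for answer in candidates[:k]
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy usage with a dummy scorer (string length stands in for a real model):
dummy_scorer = lambda sentence: float(len(sentence))
ranked = rank_candidates("The bus is [MASK].", ["red", "yellow", "blue"], dummy_scorer)
print(ranked[0][0])
```

In the actual method the scorer would be a pretrained vision-language model conditioned on the image; only the selection-by-score structure is shown here.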