LIVE: Learnable In-Context Vector for Visual Question Answering

Authors: Yingzhe Peng, Chenduo Hao, Xinting Hu, Jiawei Peng, Xin Geng, Xu Yang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that LIVE can significantly reduce computational costs while enhancing accuracy in VQA tasks compared to traditional ICL and other non-learnable ICV methods.
Researcher Affiliation | Academia | 1 Southeast University; 2 Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China ({yingzhe.peng, 213201447, pengjiawei, xgeng, xuyang_palm}@seu.edu.cn); 3 Nanyang Technological University (xinting001@e.ntu.edu.sg)
Pseudocode | No | The paper describes the method mathematically and shows a training pipeline diagram (Figure 2), but provides no explicit pseudocode block or algorithm listing.
Open Source Code | Yes | The code is available at https://github.com/ForJadeForest/LIVE-Learnable-In-Context-Vector.
Open Datasets | Yes | We evaluate our approach using the IDEFICS-9B model [9] across two datasets: VQAv2 [47] and OKVQA [48].
Dataset Splits | Yes | For both VQAv2 and OKVQA datasets, we train our LIVE on 8,000 pairs from each training set. Due to computational resource limitations, we randomly sample 10,000 question-answer pairs from the VQAv2 validation split for evaluation [18]. For OKVQA, we utilize the entire validation split. (A hedged sampling sketch follows this table.)
Hardware Specification | Yes | During the inference process, we utilize two Xeon Silver 3414 CPUs, one RTX 3090 GPU, and 384 GB of memory.
Software Dependencies | No | The paper mentions 'optimizer AdamW [52]' but does not specify version numbers for crucial software libraries such as Python, PyTorch, or CUDA. (An environment-logging sketch follows this table.)
Experiment Setup | Yes | Table 7: VQAv2 and OKVQA LIVE Training Parameters (VQAv2 / OKVQA): optimizer AdamW [52] / AdamW; learning rate of α 1e-2 / 1e-2; learning rate of V 1e-3 / 5e-3; λ 0.5 / 0.5; weight decay 1e-3 / 1e-3; precision FP16 / FP16; batch size 2 / 2; warm up 0.1 / 0.1; accumulate batches 8 / 8; number of epochs 10 / 10. (A hedged optimizer-configuration sketch follows this table.)
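
The Open Datasets and Dataset Splits rows report evaluation with IDEFICS-9B on VQAv2 and OKVQA, using 10,000 question-answer pairs sampled from the VQAv2 validation split. The sketch below is a minimal, hypothetical reproduction of that setup, assuming the public HuggingFaceM4/idefics-9b checkpoint and the standard VQAv2 validation question file; the checkpoint name, file path, and sampling seed are assumptions, since the paper does not report them.

```python
# Hypothetical reproduction sketch; not taken from the authors' repository.
import json
import random

import torch
from transformers import AutoProcessor, IdeficsForVisionText2Text

CHECKPOINT = "HuggingFaceM4/idefics-9b"  # assumed public checkpoint name

# Load the IDEFICS-9B model and its processor in half precision.
processor = AutoProcessor.from_pretrained(CHECKPOINT)
model = IdeficsForVisionText2Text.from_pretrained(
    CHECKPOINT, torch_dtype=torch.float16, device_map="auto"
)

# The paper samples 10,000 question-answer pairs from the VQAv2 validation split;
# the seed is not reported, so 42 here is an arbitrary choice.
with open("v2_OpenEnded_mscoco_val2014_questions.json") as f:  # assumed local path
    questions = json.load(f)["questions"]

rng = random.Random(42)
eval_subset = rng.sample(questions, k=10_000)
print(len(eval_subset), "validation questions sampled for evaluation")
```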
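
Because the Software Dependencies row notes that no library versions are pinned, a reproduction run should at least record its own environment. The snippet below is a small, generic logging sketch (not from the authors' code) that prints the versions most likely to matter for reproducing the results.

```python
# Record the software environment alongside any reproduction results.
import platform

import torch
import transformers

print("Python      :", platform.python_version())
print("PyTorch     :", torch.__version__)
print("CUDA (torch):", torch.version.cuda)
print("Transformers:", transformers.__version__)
print("GPU         :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
```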
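
The Experiment Setup row lists the Table 7 hyperparameters. The sketch below shows one plausible way to wire the VQAv2 column into a PyTorch AdamW optimizer with separate learning rates for α and the in-context vectors V; the parameter tensors, their shapes, and the reading of "warm up 0.1" as a 10% linear warmup are assumptions, not details taken from the authors' code.

```python
# Hypothetical wiring of the Table 7 VQAv2 settings; `alpha_params` and `icv_params`
# are placeholder names, not identifiers from the released LIVE implementation.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

alpha_params = [torch.nn.Parameter(torch.ones(32))]       # stand-in for the α weights
icv_params = [torch.nn.Parameter(torch.zeros(32, 4096))]  # stand-in for the vectors V

optimizer = AdamW(
    [
        {"params": alpha_params, "lr": 1e-2},  # learning rate of α
        {"params": icv_params, "lr": 1e-3},    # learning rate of V (5e-3 for OKVQA)
    ],
    weight_decay=1e-3,
)

# Batch size 2 with 8 accumulation steps over 8,000 training pairs for 10 epochs.
epochs, batch_size, accumulate = 10, 2, 8
steps_per_epoch = 8_000 // (batch_size * accumulate)
total_steps = epochs * steps_per_epoch
warmup_steps = int(0.1 * total_steps)  # "warm up 0.1" read as 10% of optimizer steps

def lr_lambda(step: int) -> float:
    # Linear warmup, then a constant learning rate.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return 1.0

scheduler = LambdaLR(optimizer, lr_lambda)
```

FP16 precision and the loss weight λ = 0.5 from Table 7 are not exercised in this sketch; in practice they would enter through mixed-precision training and the weighting of the training objective, respectively.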