GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph

Authors: Xin Li, Dongze Lian, Zhihe Lu, Jiawang Bai, Zhibo Chen, Xinchao Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on 11 benchmark datasets reveal that our GraphAdapter significantly outperforms previous adapter-based methods.
Researcher Affiliation | Academia | University of Science and Technology of China; National University of Singapore; Tsinghua University
Pseudocode | No | The paper describes the method using mathematical equations and prose but does not provide pseudocode or an algorithm block.
Open Source Code | Yes | The code will be released at https://github.com/lixinustc/GraphAdapter
Open Datasets | Yes | Following previous adapter-style studies [18, 82, 86], we validate our GraphAdapter on 11 few-shot classification tasks, including ImageNet [12], Stanford Cars [30], UCF101 [63], Caltech101 [17], Flowers102 [51], SUN397 [75], DTD [11], EuroSAT [21], FGVC Aircraft [48], Oxford Pets [53], and Food101 [3].
Dataset Splits | No | The paper mentions training on 'few-shot training samples', evaluating on '11 few-shot classification tasks', and using ImageNet-V2, -Sketch, -A, and -R for generalization, but it does not specify explicit train/validation/test splits with percentages or absolute counts for the full datasets.
Hardware Specification | Yes | The training and inference are implemented with a single NVIDIA GeForce RTX 3090.
Software Dependencies | No | The paper mentions using the Adam optimizer and models like CLIP, ResNet-50, and ViT, but it does not specify version numbers for general software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | We optimize our model for 100 epochs for 1, 2, 4, 8, and 16-shots. In the training process, we utilize the Adam optimizer with an initial learning rate of 1e-3, which drops with the cosine learning rate decay schedule. Notably, to achieve stable training, we follow previous works [82, 89] and utilize the warmup strategy for the training, where the small learning rate 1e-5 is applied at the first epoch. The data augmentation strategy only contains "random resized cropping" and "random flipping".
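
For reference, the reported setup maps onto a short PyTorch-style sketch like the one below. Only the epoch count, the Adam optimizer, the 1e-3 initial learning rate with cosine decay, the 1e-5 first-epoch warmup, and the two augmentations come from the paper; the adapter stand-in, input resolution, and normalization constants are assumptions for illustration, not the authors' released implementation.

import math
from torch import nn, optim
from torchvision import transforms

EPOCHS = 100      # reported: 100 epochs for the 1/2/4/8/16-shot settings
BASE_LR = 1e-3    # reported initial learning rate
WARMUP_LR = 1e-5  # reported small learning rate applied at the first epoch

# Stand-in for the tuned adapter parameters; the actual GraphAdapter module
# is defined in the paper and its released code, not here.
adapter = nn.Linear(512, 512)
optimizer = optim.Adam(adapter.parameters(), lr=BASE_LR)

# Reported augmentations: random resized cropping and random flipping.
# The 224x224 resolution and CLIP-style normalization constants are assumed.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.4815, 0.4578, 0.4082),
                         std=(0.2686, 0.2613, 0.2758)),
])

def lr_at(epoch: int) -> float:
    # Constant warmup LR for the first epoch, then cosine decay from BASE_LR.
    if epoch == 0:
        return WARMUP_LR
    progress = (epoch - 1) / max(EPOCHS - 1, 1)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))

for epoch in range(EPOCHS):
    for group in optimizer.param_groups:
        group["lr"] = lr_at(epoch)
    # ... iterate over the few-shot training samples and update the adapter here ...

The schedule is written out by hand rather than with a library scheduler so that the one-epoch warmup followed by cosine decay is explicit; the training loop body over the few-shot samples is omitted.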