Enhancing Domain Adaptation through Prompt Gradient Alignment
Authors: Viet Hoang Phan, Tung Lam Tran, Quyen Tran, Trung Le
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our method consistently surpasses other vision-language model adaptation methods by a large margin on a wide range of benchmarks. The implementation is available at https://github.com/VietHoang1512/PGA. 1 Introduction... 5 Experiments |
| Researcher Affiliation | Collaboration | Hoang Phan, New York University, hvp2011@nyu.edu; Lam Tran, VinAI Research, lamtt12@vinai.io; Quyen Tran, VinAI Research, quyentt15@vinai.io; Trung Le, Monash University, trunglm@monash.edu |
| Pseudocode | Yes | Algorithm 1: Prompt gradient alignment for unsupervised domain adaptation (a hedged sketch of such an alignment step appears after this table) |
| Open Source Code | Yes | The implementation is available at https://github.com/VietHoang1512/PGA. |
| Open Datasets | Yes | Datasets. We conduct experiments using three well-established UDA datasets of varying scales: ImageCLEF [17], Office-Home [86], and DomainNet [87], respectively. Detailed descriptions of these datasets are available in Appendix C.1. |
| Dataset Splits | No | The paper states that it follows the protocols of prior work ('following the same protocol of recent prompt-based UDA studies [25, 28]') but does not explicitly state the train/validation/test splits used for the main experiments. |
| Hardware Specification | Yes | All experiments are run on Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz and NVIDIA A100-SXM4-80GB GPU. |
| Software Dependencies | No | The paper mentions using ResNet50, ResNet101, pretrained-CLIP, mini-batch SGD optimizer, and a cosine learning rate scheduler, but does not provide specific version numbers for any software or libraries. |
| Experiment Setup | Yes | For fair comparisons, we use a ResNet50 as our backbone on ImageCLEF and Office-Home and a ResNet101 on DomainNet. Their weights are taken from pretrained CLIP and kept frozen during training. Prompts are trained with the mini-batch SGD optimizer with learning rates of 0.003 and 0.005. We use a batch size of 32 and adopt a cosine learning rate scheduler. For hyper-parameters, token lengths M1 and M2 are both set to 16. Pseudo-label threshold τ is set to 0.4 for producing reliable labels. ρ_gn, ρ_ga and λ are found using grid-search. Details are provided in the public source code. (A hedged configuration sketch of this setup appears after the table.) |
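
The pseudocode row above names Algorithm 1 (prompt gradient alignment) but the excerpt does not reproduce it. Below is a minimal sketch of what one gradient-alignment update could look like, assuming learnable prompt tokens, separate source and target losses, and a cosine-similarity alignment term weighted by `lam`; the function name and all numeric values are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a prompt gradient alignment step, assuming a PyTorch setup.
# `prompt`, `source_loss`, `target_loss`, and `lam` are illustrative names; only the
# idea (align per-domain gradients of the prompt parameters) comes from the paper.
import torch
import torch.nn.functional as F


def gradient_alignment_step(prompt, source_loss, target_loss, optimizer, lam=0.1):
    # Per-domain gradients w.r.t. the learnable prompt tokens; create_graph=True
    # lets the cosine-alignment term itself be differentiated during backward().
    g_src = torch.autograd.grad(source_loss, prompt, retain_graph=True, create_graph=True)[0]
    g_tgt = torch.autograd.grad(target_loss, prompt, retain_graph=True, create_graph=True)[0]

    # Reward source/target gradients that point in the same direction.
    alignment = F.cosine_similarity(g_src.flatten(), g_tgt.flatten(), dim=0)
    total_loss = source_loss + target_loss - lam * alignment

    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```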
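
For the experiment-setup row, the following is a minimal configuration sketch, again assuming PyTorch. The learning rate, batch size, cosine schedule, 16-token prompts, and pseudo-label threshold come from the table; the embedding width, momentum, and epoch count are assumptions not stated in the excerpt.

```python
# Sketch of the reported optimisation setup: mini-batch SGD (lr 0.003 or 0.005),
# batch size 32, cosine learning-rate schedule, 16-token prompts, pseudo-label
# threshold 0.4. embed_dim, momentum, and num_epochs are assumed values.
import torch

M1 = M2 = 16                  # prompt token lengths (from the paper)
embed_dim = 512               # assumed CLIP text-embedding width
num_epochs = 50               # assumed; not stated in the excerpt

# Learnable prompt tokens are the only trainable parameters; the CLIP backbone stays frozen.
prompts = torch.nn.Parameter(0.02 * torch.randn(M1 + M2, embed_dim))

optimizer = torch.optim.SGD([prompts], lr=0.003, momentum=0.9)   # 0.005 is also reported
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

batch_size = 32
pseudo_label_threshold = 0.4  # keep target pseudo-labels only above this confidence
```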