Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models

Authors: Yicheng Xu, Yuxin Chen, Jiahao Nie, Yusong Wang, Huiping Zhuang, Manabu Okumura

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiment results affirm RAIL's state-of-the-art performance in both X-TAIL and existing Multi-domain Task-Incremental Learning settings.
Researcher Affiliation | Academia | Yicheng Xu (1); affiliations: (1) Institute of Science Tokyo, (2) University of California, Berkeley, (3) Nanyang Technological University, (4) South China University of Technology, (5) Greater Bay Area Institute for Innovation, Hunan University, China
Pseudocode | Yes | The pseudo-codes of both training and testing algorithms are provided in Appendix A.
Open Source Code | Yes | The code is released at https://github.com/linghan1997/Regression-based-Analytic-Incremental-Learning.
Open Datasets | Yes | we select 10 different image-classification datasets from different domains for our setting: Aircraft [24], Caltech101 [25], DTD [26], EuroSAT [27], Flowers [28], Food [29], MNIST [30], Oxford Pet [31], Stanford Cars [32], and SUN397 [33].
Dataset Splits | Yes | The optimal values are determined by minimizing the regression error on the validation set of the first domain, without access to future domains.
Hardware Specification | Yes | All results are obtained on Ubuntu 20.04 with an Intel Core i9-13900K CPU and a single RTX 4090Ti GPU, averaged over 3 runs.
Software Dependencies | No | The paper mentions using a pre-trained CLIP model and an Ubuntu operating system, but does not provide specific version numbers for software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We conduct a grid search for the regularization parameter λ over the range {10^-6, 10^-5, ..., 1} and the RBF kernel bandwidth over the range {10^-6, 10^-5, ..., 10}.
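
To make the Open Datasets row concrete, below is a minimal sketch of how the ten cross-domain classification datasets could be assembled with torchvision. The use of torchvision, and the exact class names, is an assumption on our part, since the paper does not list its software dependencies (see the Software Dependencies row); constructor arguments such as splits, transforms, and download flags differ per dataset and are omitted.

    # Hypothetical mapping of the ten domains to torchvision dataset classes.
    # torchvision itself is an assumption: the paper does not specify its dependencies.
    from torchvision import datasets

    DOMAIN_DATASETS = {
        "Aircraft":     datasets.FGVCAircraft,
        "Caltech101":   datasets.Caltech101,
        "DTD":          datasets.DTD,
        "EuroSAT":      datasets.EuroSAT,
        "Flowers":      datasets.Flowers102,
        "Food":         datasets.Food101,
        "MNIST":        datasets.MNIST,
        "OxfordPet":    datasets.OxfordIIITPet,
        "StanfordCars": datasets.StanfordCars,
        "SUN397":       datasets.SUN397,
    }

Instantiating each entry with the splits used in the paper would still require the authors' released code as the reference, since per-dataset arguments are not documented here.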
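
The hyperparameter selection reported in the Dataset Splits and Experiment Setup rows (a grid search over the ridge regularization λ and the RBF kernel bandwidth, scored by regression error on the first domain's validation set) can be illustrated with the sketch below. This is not the authors' implementation: the closed-form ridge solver, the use of training features as RBF centers, and reading "bandwidth" as the gamma coefficient in exp(-gamma * ||x - c||^2) are all assumptions made for illustration; the released repository is the authoritative reference.

    # Minimal sketch of the validation-based grid search described in the paper.
    # NumPy, the closed-form ridge solution, and the gamma parameterization are assumptions.
    import numpy as np

    def rbf_features(X, centers, gamma):
        # Pairwise squared distances -> RBF responses against a set of centers.
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    def ridge_fit(Phi, Y, lam):
        # Closed-form ridge regression: W = (Phi^T Phi + lam * I)^{-1} Phi^T Y.
        d = Phi.shape[1]
        return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ Y)

    def select_hyperparams(X_tr, Y_tr, X_val, Y_val, lambdas, gammas):
        # Pick (lam, gamma) minimizing regression error on the first domain's validation set.
        best = (None, None, np.inf)
        for gamma in gammas:
            Phi_tr  = rbf_features(X_tr,  X_tr, gamma)   # training features double as centers
            Phi_val = rbf_features(X_val, X_tr, gamma)
            for lam in lambdas:
                W = ridge_fit(Phi_tr, Y_tr, lam)
                err = np.mean((Phi_val @ W - Y_val) ** 2)
                if err < best[2]:
                    best = (lam, gamma, err)
        return best

    lambdas = [10.0 ** k for k in range(-6, 1)]  # 1e-6, 1e-5, ..., 1
    gammas  = [10.0 ** k for k in range(-6, 2)]  # 1e-6, 1e-5, ..., 10

Here X_tr, Y_tr and X_val, Y_val would be frozen CLIP features and one-hot labels from the first domain's training and validation splits, matching the paper's statement that future domains are not accessed during selection.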