ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization
Authors: Qishi Dong, Awais Muhammad, Fengwei Zhou, Chuanlong Xie, Tianyang Hu, Yongxin Yang, Sung-Ho Bae, Zhenguo Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our paradigm on a diverse model zoo consisting of 35 models for various OoD tasks and demonstrate: (i) model ranking is better correlated with fine-tuning ranking than previous methods and up to 9859x faster than brute-force fine-tuning; (ii) OoD generalization after model ensemble with feature selection outperforms the state-of-the-art methods, and the accuracy on the most challenging task, DomainNet, is improved from 46.5% to 50.6%. In this section, we demonstrate the effectiveness of ZooD. First, we evaluate the ability of our ranking metric to estimate OoD performance and compare it with the ground-truth performance and several existing IID ranking methods. Second, we show that our aggregation method achieves significant improvements and SOTA results on several OoD datasets. |
| Researcher Affiliation | Collaboration | Qishi Dong (2,1), Awais Muhammad (3,1), Fengwei Zhou (1), Chuanlong Xie (4,1), Tianyang Hu (1), Yongxin Yang (1), Sung-Ho Bae (3), Zhenguo Li (1); (1) Huawei Noah's Ark Lab, (2) Hong Kong Baptist University, (3) Kyung-Hee University, (4) Beijing Normal University |
| Pseudocode | Yes | Algorithm 1: Pseudocode of Variational EM Algorithm for Bayesian Feature Selection (a generic illustrative sketch of this kind of procedure appears after this table) |
| Open Source Code | No | Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/zood; it will be released upon publication. |
| Open Datasets | Yes | We conduct experiments on six OoD datasets: PACS [43], VLCS [24], Office-Home [77], Terra Incognita [10], DomainNet [63], and NICO (NICO-Animals & NICO-Vehicles) [31]. |
| Dataset Splits | Yes | The standard way to conduct the experiment is to choose one domain as the test (unseen) domain and use the remaining domains as training domains, which is known as the leave-one-domain-out protocol. We adopt the leave-one-domain-out cross-validation setup in DomainBed with 10 experiments for hyper-parameter selection and run 3 trials. (A minimal sketch of this protocol appears after this table.) |
| Hardware Specification | No | The paper mentions 'GPU hours' and 'GPU years' in Section 4.3 and Table 1c, and states it includes 'the type of resources used' in the checklist, but it does not specify concrete hardware details such as specific GPU models (e.g., V100, A100), CPU models, or cloud providers. |
| Software Dependencies | No | The paper mentions software like 'MindSpore' and references 'PyTorch' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We adopt the leave-one-domain-out cross-validation setup in DomainBed with 10 experiments for hyper-parameter selection and run 3 trials. We triple the number of iterations for DomainNet (5000 to 15000) as it is a large-scale dataset requiring more iterations [17] and decrease the number of experiments for hyper-parameter selection from 10 to 5. |
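The Pseudocode row refers to the paper's Algorithm 1 (a variational EM algorithm for Bayesian feature selection), which is not reproduced in this report. Purely as an illustration of what such a procedure typically looks like, the sketch below implements a generic spike-and-slab variational update for a linear model in the style of Carbonetto & Stephens (2012). It is not the paper's algorithm, and the prior and hyper-parameter choices (`sigma2`, `sigma2_b`, `pi`) are assumptions made for the example.

```python
import numpy as np

def spike_slab_variational_em(X, y, sigma2=1.0, sigma2_b=1.0, pi=0.1,
                              n_iters=100):
    """Generic variational EM for Bayesian feature selection in a linear model
    y = X @ (s * b) + noise, with s_j ~ Bernoulli(pi) and b_j ~ N(0, sigma2_b).
    The variational posterior q(s_j = 1) = alpha_j is a per-feature inclusion
    probability; features with high alpha_j are selected."""
    n, p = X.shape
    xtx = np.einsum("ij,ij->j", X, X)            # per-feature squared norms
    alpha = np.full(p, pi)                        # inclusion probabilities
    mu = np.zeros(p)                              # posterior means given s_j = 1
    Xr = X @ (alpha * mu)                         # current fitted values
    for _ in range(n_iters):
        # ---- variational E-step: coordinate-wise updates ----
        s2 = sigma2 / (xtx + sigma2 / sigma2_b)   # posterior variances given s_j = 1
        for j in range(p):
            Xr -= X[:, j] * (alpha[j] * mu[j])    # remove feature j's contribution
            mu[j] = (s2[j] / sigma2) * X[:, j] @ (y - Xr)
            logit = (np.log(pi / (1 - pi))
                     + 0.5 * np.log(s2[j] / sigma2_b)
                     + mu[j] ** 2 / (2 * s2[j]))
            alpha[j] = 1.0 / (1.0 + np.exp(-logit))
            Xr += X[:, j] * (alpha[j] * mu[j])    # add updated contribution back
        # ---- M-step (empirical Bayes): update the prior inclusion rate ----
        pi = float(np.clip(alpha.mean(), 1e-6, 1 - 1e-6))
    return alpha, mu
```

A fuller treatment would also re-estimate the noise variance `sigma2` in the M-step; it is held fixed here to keep the sketch short.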
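For readers reproducing the evaluation protocol quoted in the Dataset Splits and Experiment Setup rows, the following is a minimal sketch of a leave-one-domain-out setup with hyper-parameter selection by cross-validation over the training domains. It is not the actual DomainBed implementation; `train_on_domains`, `evaluate`, and `sample_hparams` are hypothetical callables standing in for a real training loop, dataset loaders, and a hyper-parameter sampler.

```python
from statistics import mean

def leave_one_domain_out(domains, train_on_domains, evaluate,
                         sample_hparams, n_hparam_trials=10, n_seeds=3):
    """Each domain is held out once as the unseen test domain; the remaining
    domains are used for training. Hyper-parameters are selected by
    cross-validation over the training domains only (the test domain is never
    used for selection), and the chosen configuration is scored on the
    held-out domain, averaged over several seeds."""
    scores = {}
    for test_domain in domains:
        train_domains = [d for d in domains if d != test_domain]
        # ---- hyper-parameter selection (never touches the test domain) ----
        best_hparams, best_val = None, float("-inf")
        for _ in range(n_hparam_trials):
            hparams = sample_hparams()
            val_accs = []
            for val_domain in train_domains:
                fit_domains = [d for d in train_domains if d != val_domain]
                model = train_on_domains(fit_domains, hparams, seed=0)
                val_accs.append(evaluate(model, val_domain))
            if mean(val_accs) > best_val:
                best_val, best_hparams = mean(val_accs), hparams
        # ---- final evaluation on the unseen domain, averaged over seeds ----
        test_accs = [
            evaluate(train_on_domains(train_domains, best_hparams, seed=s),
                     test_domain)
            for s in range(n_seeds)
        ]
        scores[test_domain] = mean(test_accs)
    scores["average"] = mean(scores.values())
    return scores
```

The reported per-dataset number is then the average over all held-out domains, matching the leave-one-domain-out convention quoted above.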