AnyDA: Anytime Domain Adaptation
Authors: Omprakash Chakraborty, Aadarsh Sahoo, Rameswar Panda, Abir Das
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on 4 benchmark datasets and demonstrate that AnyDA achieves superior performance over the state-of-the-art methods, more significantly at lower computation budgets. We also include comprehensive ablation studies to depict the importance of each module of our proposed framework. |
| Researcher Affiliation | Collaboration | IIT Kharagpur, MIT-IBM Watson AI Lab; {opckgp@,abir@cse.}iitkgp.ac.in, {aadarsh,rpanda}@ibm.com |
| Pseudocode | Yes | The training pseudocode for AnyDA is shown in Algorithm 1. |
| Open Source Code | Yes | Project page: https://cvir.github.io/projects/anyda |
| Open Datasets | Yes | The dataset is publicly available to download at: https://people.eecs.berkeley.edu/~jhoffman/domainadapt/#datasets_code. (Office-31) |
| Dataset Splits | Yes | In addition, following general practice, we use a validation set to obtain the best hyperparameters. |
| Hardware Specification | Yes | All the experiments were performed using 4 NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using ResNet-50 and MobileNetV3 as network architectures, and DANN as a domain adaptation method, but does not specify software dependencies like Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | For Eqn. 1, we use τstu = 0.1 and τtea = 0.04. We use a momentum hyperparameter value of λ = 0.96. In Eqn. 3, a threshold value of τpl = 0.9 was used for the Office-31 and Office-Home datasets, while τpl = 0.4 for the DomainNet dataset. In Eqn. 4, we use λcls = 1, 15, 64 and λrd = 1, 1, 0.5 for Office-31, Office-Home and DomainNet, respectively, and λpl = 0.1 for all the datasets. We perform warm-up using source data for 100, 100, 30 epochs for Office-31, Office-Home and DomainNet, respectively. The proposed approach AnyDA is then trained for 30, 100, 20 epochs, respectively. We use a per-GPU batch size of 64 (32 source + 32 target) for all the experiments. We use a learning rate of 2e-4 for Office-31 and Office-Home, and 3e-5 for DomainNet. We follow cosine annealing to update the learning rate. (Hedged sketches of these settings appear after the table.) |
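
For orientation, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. The sketch below is a minimal illustration assuming PyTorch; the optimizer class, the placeholder model, and the variable names (`HPARAMS`, `cfg`) are assumptions, not taken from the paper. Only the numeric values and the cosine-annealed learning-rate schedule come from the quoted text.

```python
import torch

# Hypothetical container for the per-dataset values quoted in the setup row.
HPARAMS = {
    "Office-31":   dict(tau_pl=0.9, lambda_cls=1,  lambda_rd=1.0, lambda_pl=0.1,
                        warmup_epochs=100, train_epochs=30,  lr=2e-4),
    "Office-Home": dict(tau_pl=0.9, lambda_cls=15, lambda_rd=1.0, lambda_pl=0.1,
                        warmup_epochs=100, train_epochs=100, lr=2e-4),
    "DomainNet":   dict(tau_pl=0.4, lambda_cls=64, lambda_rd=0.5, lambda_pl=0.1,
                        warmup_epochs=30,  train_epochs=20,  lr=3e-5),
}
TAU_STU, TAU_TEA, EMA_MOMENTUM = 0.1, 0.04, 0.96   # Eqn. 1 temperatures, teacher momentum
PER_GPU_BATCH = 64                                  # 32 source + 32 target

cfg = HPARAMS["Office-31"]
model = torch.nn.Linear(10, 2)  # placeholder; the paper uses ResNet-50 / MobileNetV3 backbones
# The optimizer is not specified in the quoted text; AdamW is an assumption.
optimizer = torch.optim.AdamW(model.parameters(), lr=cfg["lr"])
# Cosine annealing of the learning rate over the training epochs, as stated in the paper.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=cfg["train_epochs"])
```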
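The remaining quoted hyperparameters (the temperatures τstu and τtea in Eqn. 1, the teacher momentum λ, and the pseudo-label threshold τpl in Eqn. 3) suggest a student-teacher scheme with an EMA-updated teacher and confidence-thresholded pseudo-labels. The quoted text does not give the equations themselves, so the following is only a hedged sketch of such a scheme under those assumptions, not the paper's exact formulation; all function names are hypothetical.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, tau_stu=0.1, tau_tea=0.04):
    # Cross-entropy between temperature-sharpened teacher targets and student
    # predictions; a common form for an Eqn. 1-style distillation objective.
    teacher_probs = F.softmax(teacher_logits / tau_tea, dim=-1).detach()
    student_log_probs = F.log_softmax(student_logits / tau_stu, dim=-1)
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(teacher, student, momentum=0.96):
    # Teacher weights track the student via an exponential moving average (momentum λ).
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

def pseudo_label_mask(target_logits, tau_pl=0.9):
    # Keep only target predictions whose confidence exceeds τpl for the
    # pseudo-label loss (an Eqn. 3-style thresholding rule).
    probs = F.softmax(target_logits, dim=-1)
    conf, labels = probs.max(dim=-1)
    return labels, conf.ge(tau_pl)
```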