KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation

Authors: Haozhe Feng, Zhaoyang You, Minghao Chen, Tianye Zhang, Minfeng Zhu, Fei Wu, Chao Wu, Wei Chen

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The extensive experiments show that KD3A significantly outperforms state-of-the-art UMDA approaches. Moreover, KD3A is robust to negative transfer and brings a 100× reduction in communication cost compared with other decentralized UMDA methods.
Researcher Affiliation | Academia | 1 State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China; 2 College of Computer Science and Technology, Zhejiang University, Hangzhou, China; 3 School of Public Affairs, Zhejiang University, Hangzhou, China.
Pseudocode | Yes | Algorithm 1: KD3A training process with epoch t.
Open Source Code | Yes | In addition, our KD3A is easy to implement, and we create an open-source framework to conduct KD3A on different benchmarks.
Open Datasets | Yes | We perform experiments on four benchmark datasets: (1) Amazon Review (Ben-David et al., 2006), (2) Digit-5 (Zhao et al., 2020), (3) Office-Caltech10 (Gong et al., 2012), and (4) DomainNet (Peng et al., 2019)...
Dataset Splits | No | The paper describes training on source domains and evaluating on target domains, as is standard in domain adaptation, but does not provide specific train/validation/test splits (e.g., 80/10/10) within individual domains for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using PyTorch and the SGD optimizer, but does not specify exact version numbers for software dependencies such as PyTorch, Python, or CUDA.
Experiment Setup | Yes | For model optimization, we use SGD with 0.9 momentum as the optimizer and a cosine schedule to decay the learning rate from a high initial value (0.05 for Amazon Review and Digit-5; 0.005 for Office-Caltech10 and DomainNet) to zero. ... The confidence gate is the only hyper-parameter in KD3A and should be treated carefully. ... Therefore, we gradually increase it from low (e.g., 0.8) to high (e.g., 0.95) during training.
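
For orientation, below is a minimal PyTorch sketch of the optimization setup quoted above. The paper specifies only SGD with 0.9 momentum, a cosine learning-rate decay to zero, and a confidence gate rising from roughly 0.8 to 0.95; the placeholder model, the epoch count, and the linear ramp in the hypothetical `confidence_gate` helper are assumptions made here for illustration.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder model; the paper uses dataset-specific backbones.
model = torch.nn.Linear(2048, 345)
num_epochs = 100  # assumed; not specified in the quoted setup

# SGD with 0.9 momentum; initial lr 0.05 (Amazon Review, Digit-5)
# or 0.005 (Office-Caltech10, DomainNet), decayed to zero by a
# cosine schedule over training.
optimizer = SGD(model.parameters(), lr=0.05, momentum=0.9)
scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs, eta_min=0.0)

def confidence_gate(epoch: int, low: float = 0.8, high: float = 0.95) -> float:
    """Raise the confidence gate from `low` to `high` over training.

    The paper states only that the gate grows from about 0.8 to 0.95;
    the linear ramp used here is an assumed schedule.
    """
    return low + (high - low) * epoch / max(num_epochs - 1, 1)

for epoch in range(num_epochs):
    gate = confidence_gate(epoch)
    # ... one KD3A training epoch: keep pseudo-labeled target samples
    # whose predicted confidence exceeds `gate`, then distill ...
    scheduler.step()
```

Treating the gate as a schedule rather than a fixed threshold matches the quoted guidance that it is the only hyper-parameter and should be handled carefully: a low early gate admits enough pseudo-labels to start training, while the higher late gate filters noise.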