Can We Evaluate Domain Adaptation Models Without Target-Domain Labels?

Authors: Jianfei Yang, Hanjie Qian, Yuecong Xu, Kai Wang, Lihua Xie

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the effectiveness of our metric through extensive empirical studies on UDA datasets of different scales and imbalanced distributions.
Researcher Affiliation | Academia | Jianfei Yang¹, Hanjie Qian¹, Yuecong Xu², Kai Wang², Lihua Xie¹. ¹Nanyang Technological University, ²National University of Singapore
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are available at https://sleepyseal.github.io/TransferScoreWeb/
Open Datasets | Yes | We employ four datasets in our studies for different purposes. Office-31 (Saenko et al., 2010) is the most common benchmark for UDA, including three domains (Amazon, Webcam, DSLR) in 31 categories. Office-Home (Venkateswara et al., 2017) is composed of four domains (Art, Clipart, Product, Real World) in 65 categories with distant domain shifts. VisDA-17 (Peng et al., 2017) is a synthetic-to-real object recognition dataset including a source domain with 152k synthetic images and a target domain with 55k real images from Microsoft COCO. DomainNet (Peng et al., 2019) is the largest DA dataset, containing 345 classes in 6 domains. (A minimal data-loading sketch for these benchmarks follows the table.)
Dataset Splits | No | The paper mentions using "simple validation conducted on the Office-31 dataset" for hyperparameter setting but does not provide specific percentages or counts for training/validation/test splits for any of the datasets used.
Hardware Specification | No | The paper does not explicitly describe the hardware specifications used to run its experiments.
Software Dependencies | No | The paper mentions using models like ResNet-50 and ResNet-101 but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or specific library versions).
Experiment Setup | Yes | The hyperparameters, training epochs, learning rates, and optimizers are set according to the default configurations provided in their original papers. We set the hyperparameters τ = 3 and ζ = 0.01 based on simple validation conducted on the Office-31 dataset, which performs well across all other datasets. We list all the hyperparameters of our baseline methods in Tab. 6. We follow the original papers to set these hyperparameters, except where we found better-performing values, which are listed in the table. Note that LR indicates the starting learning rate and the decay strategy is the same as in the original papers. (A hedged configuration sketch follows the table.)
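
All four benchmarks in the Open Datasets row are image-classification datasets organised as per-domain folders of class subdirectories. Below is a minimal sketch, assuming an Office-31 layout of office31/&lt;domain&gt;/&lt;class&gt;/&lt;image&gt;.jpg, of how a labelled source loader and a target loader (whose labels are never consulted at evaluation time) could be built with torchvision; the paths, transforms, and batch size are illustrative assumptions, not details taken from the paper.

```python
# Illustrative UDA data loading (assumption: Office-31 stored as
# office31/<domain>/<class>/<image>.jpg; all concrete values are hypothetical).
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),                    # ResNet input resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

# Source domain: labels are used for training.
source_set = datasets.ImageFolder("office31/amazon", transform=transform)
# Target domain: labels exist in the folder structure but are not used
# for evaluation, matching the paper's label-free target setting.
target_set = datasets.ImageFolder("office31/webcam", transform=transform)

source_loader = DataLoader(source_set, batch_size=32, shuffle=True, num_workers=4)
target_loader = DataLoader(target_set, batch_size=32, shuffle=True, num_workers=4)
```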
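
The Experiment Setup row fixes the metric's hyperparameters at τ = 3 and ζ = 0.01 and otherwise reuses each baseline's original optimizer and learning-rate decay. A hedged sketch of what such a configuration might look like for a ResNet-50 backbone is shown below; the SGD settings and the inverse-decay schedule are common UDA defaults assumed for illustration, not values reported in this section.

```python
# Illustrative baseline configuration (assumptions: SGD with momentum and an
# inverse learning-rate decay, both common UDA defaults; only TAU and ZETA
# come from the paper's reported setting).
import torch
from torchvision.models import resnet50

TAU = 3       # metric hyperparameter tau, as reported in the paper
ZETA = 0.01   # metric hyperparameter zeta, as reported in the paper

model = resnet50(weights="IMAGENET1K_V1")                  # ImageNet-pretrained backbone
model.fc = torch.nn.Linear(model.fc.in_features, 31)       # e.g. 31 Office-31 classes

optimizer = torch.optim.SGD(model.parameters(), lr=0.01,   # starting LR (hypothetical)
                            momentum=0.9, weight_decay=5e-4)

def inverse_decay(step, gamma=10.0, power=0.75, max_steps=10000):
    """Multiplier for lr_t = lr_0 * (1 + gamma * t / T) ** (-power)."""
    return (1.0 + gamma * step / max_steps) ** (-power)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=inverse_decay)
```

In practice, the per-method learning rates and epoch counts would follow Tab. 6 of the paper rather than the fixed numbers used above.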