Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond

Authors: Haoxiang Wang, Bo Li, Han Zhao

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, which confirms our findings. Empirical results shown in Fig. 3 are consistent with the theoretical prediction of our generalization bound discussed in Sec. 4.4: with the source and target fixed, along a chosen path of intermediate domains (e.g., counter-clockwise rotation in Rotated MNIST from 0 to 60 degrees), the target-domain test error first decreases and then increases, indicating the existence of an optimal choice of T for each n under consideration.
Researcher Affiliation | Academia | University of Illinois at Urbana-Champaign, Urbana, IL, USA.
Pseudocode | No | The paper describes the gradual self-training algorithm using mathematical equations (e.g., Eq. 5) but does not present it in a pseudocode block or a clearly labeled algorithm section. (A hedged sketch of the algorithm is given after the table.)
Open Source Code | Yes | Code. Our code is provided in https://github.com/Haoxiang-Wang/gradual-domain-adaptation.
Open Datasets | Yes | Color-Shift MNIST: We normalize the pixel value of each MNIST image from [0, 255] to [0, 1]. Rotated MNIST: A semi-synthetic dataset rotating MNIST images by an angle between 0 and 60 degrees. Cover Type (Blackard & Dean, 1999): A tabular dataset hosted by the UCI repository (Dua & Graff, 2017). Portraits (Ginosar et al., 2015): An image dataset of grayscale photos of high school seniors from 1905 to 2013 (Fig. 1).
Dataset Splits | Yes | Rotated MNIST: The 50K training-set images of MNIST are divided into a source domain of 5K images (no rotation), intermediate domains of 42K images (0-60 degrees), and a validation set consisting of the remaining images. (See the construction sketch after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions software such as Keras and TensorFlow and optimizers such as Adam, but it does not specify exact version numbers for these dependencies, which would be required for reproducibility.
Experiment Setup | Yes | Following Kumar et al. (2020), we use the architecture of 3-layer ReLU MLP with Batch Norm (Ioffe & Szegedy, 2015) and Dropout(0.5) (Srivastava et al., 2014), and apply it to all datasets. We use the cross-entropy loss and the Adam optimizer (Kingma & Ba, 2015) following the practices of Kumar et al. (2020). (See the Keras sketch after the table.)
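
Since the paper states gradual self-training only through equations (see the Pseudocode row), the following minimal Python sketch restates the procedure analyzed in the paper and in Kumar et al. (2020): train on the labeled source, then pseudo-label and retrain on each intermediate domain in order. The names `gradual_self_train` and `fit_fn` are illustrative assumptions, not the authors' code.

```python
def gradual_self_train(model, x_source, y_source, intermediate_domains, fit_fn):
    """Minimal sketch of gradual self-training.

    model: any classifier exposing .predict(x) -> hard class labels.
    intermediate_domains: unlabeled arrays ordered from source-like to target-like.
    fit_fn(model, x, y): trains the model on (x, y) and returns the trained model.
    """
    # Step 0: supervised training on the labeled source domain.
    model = fit_fn(model, x_source, y_source)

    # Steps 1..T: pseudo-label each intermediate domain with the current model,
    # then retrain on those pseudo-labels before moving to the next domain.
    for x_domain in intermediate_domains:
        pseudo_labels = model.predict(x_domain)
        model = fit_fn(model, x_domain, pseudo_labels)
    return model
```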
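The dataset and split rows can be read as a construction recipe for the Rotated MNIST path. The sketch below is only an assumption-laden illustration: the paper does not say which 50K of the 60K MNIST training images are used, how rotation angles are assigned to the 42K intermediate images, or which rotation routine is used, so the first-50K slice, the linear angle schedule, and `scipy.ndimage.rotate` are all assumptions.

```python
import numpy as np
from scipy.ndimage import rotate
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

# Take 50K training images and rescale pixels from [0, 255] to [0, 1],
# mirroring the normalization quoted in the Open Datasets row.
x_train = x_train[:50_000].astype("float32") / 255.0  # assumption: first 50K images
y_train = y_train[:50_000]

# Split: 5K unrotated source images, 42K intermediate images, the rest for validation.
x_src, y_src = x_train[:5_000], y_train[:5_000]
x_mid = x_train[5_000:47_000]          # intermediate pool, used without labels
x_val, y_val = x_train[47_000:], y_train[47_000:]

# Assign each intermediate image a rotation angle increasing linearly from 0 to 60
# degrees, producing a gradually shifting path of domains (an assumption).
angles = np.linspace(0.0, 60.0, num=len(x_mid))
x_mid_rotated = np.stack([rotate(img, angle, reshape=False, order=1)
                          for img, angle in zip(x_mid, angles)])
```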
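The Experiment Setup row maps naturally onto a small Keras model. The sketch below assumes that "3-layer ReLU MLP" means two hidden Dense-BatchNorm-ReLU-Dropout blocks plus a softmax output, and the hidden width of 256 is an assumption; the authors' exact configuration is in the linked repository.

```python
from tensorflow import keras


def build_mlp(input_dim: int, num_classes: int, hidden: int = 256) -> keras.Model:
    """Sketch of the reported MLP: ReLU, BatchNorm, Dropout(0.5), cross-entropy, Adam."""
    model = keras.Sequential([keras.layers.Input(shape=(input_dim,))])
    for _ in range(2):  # assumed: two hidden blocks
        model.add(keras.layers.Dense(hidden))
        model.add(keras.layers.BatchNormalization())
        model.add(keras.layers.Activation("relu"))
        model.add(keras.layers.Dropout(0.5))
    model.add(keras.layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Note that a Keras model's `predict` returns class probabilities, so plugging such a model into the gradual self-training sketch above would additionally require an argmax to turn predictions into hard pseudo-labels.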