Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond
Authors: Haoxiang Wang, Bo Li, Han Zhao
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, which confirms our findings. Empirical results shown in Fig. 3 are consistent with the theoretical prediction of our generalization bound discussed in Sec. 4.4: with the source and target fixed, along a chosen path of intermediate domains (e.g., counter-clockwise rotation in Rotated MNIST from 0 to 60 degrees), the target domain test error first decreases and then increases, indicating the existence of an optimal choice of T for each n under consideration. |
| Researcher Affiliation | Academia | University of Illinois at Urbana-Champaign, Urbana, IL, USA. |
| Pseudocode | No | The paper describes the gradual self-training algorithm using mathematical equations (e.g., Eq. 5) but does not present it in a pseudocode block or a clearly labeled algorithm section; a hedged code sketch of the generic procedure is given below the table. |
| Open Source Code | Yes | Code. Our code is provided in https://github.com/Haoxiang-Wang/gradual-domain-adaptation. |
| Open Datasets | Yes | Color-Shift MNIST: We normalize the pixel value of each MNIST image from [0,255] to [0,1]. Rotated MNIST: A semi-synthetic dataset rotating MNIST images by an angle between 0 and 60 degrees. Cover Type (Blackard & Dean, 1999): A tabular dataset hosted by the UCI repository (Dua & Graff, 2017). Portraits (Ginosar et al., 2015): An image dataset of grayscale photos of high school seniors from 1905 to 2013 (Fig. 1). |
| Dataset Splits | Yes | Rotated MNIST: The 50K training set images of MNIST are divided into a source domain of 5K images (no rotation), intermediate domains of 42K images (0-60 degrees), and a validation set of the remaining images. (A sketch of this construction and split appears below the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software like Keras and TensorFlow and optimizers like Adam, but it does not specify exact version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | Following Kumar et al. (2020), we use the architecture of a 3-layer ReLU MLP with Batch Norm (Ioffe & Szegedy, 2015) and Dropout(0.5) (Srivastava et al., 2014), and apply it to all datasets. We use the cross-entropy loss and the Adam optimizer (Kingma & Ba, 2015) following the practices of Kumar et al. (2020). (A hedged Keras sketch of this setup appears below the table.) |
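
Since the Pseudocode row notes that gradual self-training is given only as equations (Eq. 5), the following is a minimal sketch of the generic procedure, not the authors' released implementation: fit a model on the labeled source domain, then pseudo-label each intermediate domain along the path with the current model and retrain on those pseudo-labels. The classifier choice (`LogisticRegression`), the helper `make_domain`, and the toy drifting-Gaussian data are illustrative assumptions.

```python
# Minimal sketch of gradual self-training: pseudo-label each intermediate
# domain with the current model, then retrain on those pseudo-labels.
# The classifier (LogisticRegression) and the toy drifting-Gaussian data
# below are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression


def gradual_self_train(source_x, source_y, intermediate_domains):
    """Fit on the labeled source, then adapt through each unlabeled
    intermediate domain in order via self-training."""
    model = LogisticRegression(max_iter=1000).fit(source_x, source_y)
    for domain_x in intermediate_domains:       # ordered path of domains
        pseudo_y = model.predict(domain_x)      # pseudo-label this domain
        model = LogisticRegression(max_iter=1000).fit(domain_x, pseudo_y)
    return model


# Toy usage: two Gaussian blobs whose means drift along a path of domains.
rng = np.random.default_rng(0)

def make_domain(shift, n=200):
    x0 = rng.normal(loc=[-1.0 + shift, 0.0], scale=0.3, size=(n, 2))
    x1 = rng.normal(loc=[1.0 + shift, 0.0], scale=0.3, size=(n, 2))
    return np.vstack([x0, x1]), np.array([0] * n + [1] * n)

src_x, src_y = make_domain(0.0)
path = [make_domain(s)[0] for s in np.linspace(0.2, 1.0, 5)]  # unlabeled
adapted_model = gradual_self_train(src_x, src_y, path)
```

Each intermediate shift in the toy path is small relative to the class separation, which is exactly the regime where pseudo-labels from the previous model remain mostly correct and self-training can track the shifting distribution.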
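As a rough illustration of the Rotated MNIST construction and split quoted in the Open Datasets and Dataset Splits rows (5K unrotated source images, 42K intermediate images rotated between 0 and 60 degrees, and the remaining images for validation), the sketch below uses `scipy.ndimage.rotate` and the Keras MNIST loader; the rotation utility and the per-image angle assignment are assumptions, since the released code may construct the path differently.

```python
# Sketch of the Rotated MNIST construction and split described above:
# 5K source images (no rotation), 42K intermediate images rotated by
# angles increasing from 0 to 60 degrees, and the remaining images for
# validation.  scipy.ndimage.rotate and the per-image angle assignment
# are assumptions; only the sizes and the 0-60 degree range come from
# the description above.
import numpy as np
from scipy.ndimage import rotate
from tensorflow import keras

(train_x, train_y), _ = keras.datasets.mnist.load_data()
train_x = train_x[:50_000].astype("float32") / 255.0  # pixel values in [0, 1]
train_y = train_y[:50_000]

source_x, source_y = train_x[:5_000], train_y[:5_000]             # unrotated
inter_x, inter_y = train_x[5_000:47_000], train_y[5_000:47_000]   # 42K images
val_x, val_y = train_x[47_000:], train_y[47_000:]                 # validation

# Give each intermediate image an angle that grows from 0 to 60 degrees, so
# consecutive chunks of the array form a path of gradually shifted domains.
angles = np.linspace(0.0, 60.0, num=len(inter_x))
inter_x_rotated = np.stack(
    [rotate(img, ang, reshape=False, order=1) for img, ang in zip(inter_x, angles)]
)
```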
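The Experiment Setup row fixes only the model family (3-layer ReLU MLP with Batch Norm and Dropout(0.5)), the loss, and the optimizer. The Keras sketch below fills in the unspecified details (hidden width, input shape, number of classes) with assumed values for illustration.

```python
# Keras sketch of the model family described above: a 3-layer ReLU MLP with
# BatchNorm and Dropout(0.5), trained with cross-entropy and Adam.  The
# hidden width, input shape (28x28) and number of classes (10) are assumed
# for illustration; the row above fixes only the architecture type, the
# dropout rate, the loss and the optimizer.
from tensorflow import keras
from tensorflow.keras import layers


def build_mlp(input_shape=(28, 28), num_classes=10, hidden=32):
    return keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Flatten(),
        layers.Dense(hidden), layers.BatchNormalization(),
        layers.Activation("relu"), layers.Dropout(0.5),
        layers.Dense(hidden), layers.BatchNormalization(),
        layers.Activation("relu"), layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])


model = build_mlp()
model.compile(optimizer=keras.optimizers.Adam(),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```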