Test-Time Domain Adaptation by Learning Domain-Aware Batch Normalization

Authors: Yanan Wu, Zhixiang Chi, Yang Wang, Konstantinos N. Plataniotis, Songhe Feng

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our method outperforms the prior works on five WILDS real-world domain shift datasets. Our method can also be integrated with methods with label-dependent optimization to further push the performance boundary. Our code is available at https://github.com/ynanwu/MABN.
Researcher Affiliation | Academia | Yanan Wu1,2*, Zhixiang Chi3*, Yang Wang4, Konstantinos N. Plataniotis3, Songhe Feng1,2. Affiliations: 1Key Laboratory of Big Data & Artificial Intelligence in Transportation, Ministry of Education, Beijing Jiaotong University, Beijing, 100044, China; 2School of Computer and Information Technology, Beijing Jiaotong University, Beijing, 100044, China; 3The Edward S Rogers Sr. ECE Department, University of Toronto, Toronto, M5S3G8, Canada; 4Department of Computer Science and Software Engineering, Concordia University, Montreal, H3G2J1, Canada
Pseudocode | Yes | Algorithm 1: Meta-auxiliary training of MABN
Open Source Code | Yes | Our code is available at https://github.com/ynanwu/MABN.
Open Datasets | Yes | In this work, we follow Meta-DMoE (Zhong et al. 2022) to evaluate our method on five benchmarks from WILDS (Koh et al. 2021): iWildCam (Beery et al. 2021), Camelyon17 (Bandi et al. 2018), RxRx1 (Taylor et al. 2019), FMoW (Christie et al. 2018) and PovertyMap (Yeh et al. 2020).
Dataset Splits | Yes | Note that we follow the official training/validation/testing splits and report the same metrics as in (Koh et al. 2021).
Hardware Specification | No | The paper mentions 'computational cost' and model architectures (e.g., ResNet50, DenseNet121, ResNet18) but does not provide specific details on the hardware (GPU, CPU, memory) used to run the experiments.
Software Dependencies | No | The paper mentions software components like 'ImageNet-1K pre-trained weights', 'Adam optimizer', and specific models (ResNet50, DenseNet121, ResNet18, BYOL), but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | We follow (Zhong et al. 2022) to use ImageNet-1K (Deng et al. 2009) pre-trained weights as the initialization to perform joint training. The Adam optimizer is used to minimize Eq. 2 with a learning rate (LR) of 1e-4 for 20 epochs. The LR is reduced by a factor of 2 when the loss reaches a plateau. λ in Eq. 2 is set to 0.1. During meta-auxiliary training, we fix the weight matrix of the entire network and directly use the running statistics µ and σ for the BN layers. Only the affine parameters γ and β of the BN layers are further optimized using Alg. 1 for 10 epochs with a fixed LR of 3e-4 for α and 3e-5 for δ. During testing, for each target domain, we randomly sample 12 images for iWildCam and 32 images for the remaining datasets to perform adaptation first (Lines 12-13 of Alg. 1). The adapted model is then used to test all the images in that domain. The same process is repeated for all target domains. All experiments are conducted with 5 random seeds to show the variation.
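The setup described above freezes all network weights, keeps the stored BN running statistics µ and σ, and optimizes only the BN affine parameters γ and β on a small unlabeled batch from each target domain before evaluation. The following is a minimal PyTorch sketch of that test-time adaptation step, not the authors' released code: the `adapt_bn_affine` helper and the `auxiliary_loss` callable are hypothetical stand-ins for Lines 12-13 of Alg. 1 and the paper's label-free auxiliary objective, and the default learning rate is only assumed to mirror the reported 3e-5.

```python
import torch


def collect_bn_affine_params(model: torch.nn.Module):
    """Return only the affine parameters (gamma/weight, beta/bias) of BatchNorm layers."""
    params = []
    for module in model.modules():
        if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
            if module.weight is not None:
                params.append(module.weight)
            if module.bias is not None:
                params.append(module.bias)
    return params


def adapt_bn_affine(model, target_images, auxiliary_loss, steps=1, lr=3e-5):
    """Test-time adaptation sketch: freeze all weights, update only BN gamma/beta.

    `auxiliary_loss` is a placeholder for a label-free auxiliary objective
    evaluated on the unlabeled target-domain batch.
    """
    # Freeze everything, then re-enable gradients only for the BN affine parameters.
    for p in model.parameters():
        p.requires_grad_(False)
    bn_params = collect_bn_affine_params(model)
    for p in bn_params:
        p.requires_grad_(True)

    # eval() makes BN layers normalize with their stored running statistics,
    # matching "directly use the running statistics" in the setup description.
    model.eval()
    optimizer = torch.optim.Adam(bn_params, lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        loss = auxiliary_loss(model, target_images)  # unlabeled target batch
        loss.backward()
        optimizer.step()
    return model
```

In use, one would sample a small batch from a target domain (12 images for iWildCam, 32 for the other WILDS benchmarks), call `adapt_bn_affine` on that batch, and then evaluate the adapted model on all images of that domain, repeating the process per domain.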