Unraveling Batch Normalization for Realistic Test-Time Adaptation

Authors: Zixian Su, Jingwei Guo, Kai Yao, Xi Yang, Qiufeng Wang, Kaizhu Huang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments exhibit consistent improvement and demonstrate remarkable stability. We evaluate our approach on CIFAR-10-C, CIFAR-100-C, and ImageNet-C (Hendrycks and Dietterich 2018)... Tables 1 and 2 show the error rates on three corruption benchmark datasets under a continual evaluation setting. In Table 3, we report the error rates on CIFAR-10-C under mixed-domain adaptation and gradually changing shifts, respectively. Our method consistently outperforms other approaches.
Researcher Affiliation | Academia | Zixian Su1, 2, Jingwei Guo1, 2, Kai Yao1, 2, Xi Yang1*, Qiufeng Wang1, Kaizhu Huang3* 1School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou, China 2Faculty of Science and Engineering, University of Liverpool, Liverpool, United Kingdom 3Data Science Research Center, Duke Kunshan University, Kunshan, China
Pseudocode | Yes | Algorithm 1: Layer-wise Rectification Strategy
Open Source Code | Yes | Code is available at https://github.com/kiwi12138/Realistic TTA.
Open Datasets | Yes | We evaluate our approach on CIFAR-10-C, CIFAR-100-C, and ImageNet-C (Hendrycks and Dietterich 2018)
Dataset Splits | No | The paper states that it uses pre-trained models ('pretrained by Hendrycks et al. (2020)', 'standard pretrained') and focuses on test-time adaptation; it therefore does not explicitly describe training/validation/test splits for its own experimental setup, mentioning only test image counts.
Hardware Specification | Yes | All experiments are conducted on an RTX-3090 GPU.
Software Dependencies | No | The paper mentions deep neural network frameworks and models (e.g., ResNet, BN), but it does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA, which are necessary for full reproducibility.
Experiment Setup | Yes | We conduct experiments with test batch sizes of 200, 64, 16, 4, 2, and 1 for CIFAR-10/100-C, and 64, 16, 4, 2, and 1 for ImageNet-C. In Algorithm 1, γ is set as 1/2 and τ is set as 0.1. We respectively fix ϵ and λ at 0.1 and 0.01 for the sake of simplicity. Momentum is typically selected from the set {1, 0.1, 0.01, 0.001}.
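The hyperparameters quoted in the experiment-setup row can be gathered into a single configuration sketch. This is a minimal illustration only: the function name `make_config` and the dictionary structure are assumptions of mine, not part of the paper or its released code; the values are taken verbatim from the quoted text.

```python
# Hypothetical config sketch for the quoted experiment setup.
# Values come from the paper's text; structure and names are illustrative.

def make_config(dataset: str) -> dict:
    """Return a test-time-adaptation config for one corruption benchmark."""
    batch_sizes = {
        "CIFAR-10-C":  [200, 64, 16, 4, 2, 1],
        "CIFAR-100-C": [200, 64, 16, 4, 2, 1],
        "ImageNet-C":  [64, 16, 4, 2, 1],
    }
    return {
        "dataset": dataset,
        "test_batch_sizes": batch_sizes[dataset],
        # Algorithm 1 (layer-wise rectification) settings:
        "gamma": 0.5,     # γ = 1/2
        "tau": 0.1,       # τ = 0.1
        "epsilon": 0.1,   # ϵ fixed at 0.1
        "lambda_": 0.01,  # λ fixed at 0.01
        # Momentum is selected from this grid in the paper:
        "momentum_grid": [1, 0.1, 0.01, 0.001],
    }

cfg = make_config("ImageNet-C")
print(cfg["test_batch_sizes"])  # → [64, 16, 4, 2, 1]
```

Keeping the per-dataset batch-size lists explicit makes it easy to sweep every reported setting in one loop over `cfg["test_batch_sizes"]`.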