Unraveling Batch Normalization for Realistic Test-Time Adaptation

Authors: Zixian Su, Jingwei Guo, Kai Yao, Xi Yang, Qiufeng Wang, Kaizhu Huang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments exhibit consistent improvement and demonstrate remarkable stability. We evaluate our approach on CIFAR-10-C, CIFAR-100-C, and ImageNet-C (Hendrycks and Dietterich 2018)... Tables 1 and 2 show the error rates on three corruption benchmark datasets under a continual evaluation setting. In Table 3, we report the error rates on CIFAR-10-C under mixed-domain adaptation and gradually changing shifts, respectively. Our method consistently outperforms other approaches.
Researcher Affiliation | Academia | Zixian Su1, 2, Jingwei Guo1, 2, Kai Yao1, 2, Xi Yang1*, Qiufeng Wang1, Kaizhu Huang3* 1School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou, China 2Faculty of Science and Engineering, University of Liverpool, Liverpool, United Kingdom 3Data Science Research Center, Duke Kunshan University, Kunshan, China
Pseudocode | Yes | Algorithm 1: Layer-wise Rectification Strategy
Open Source Code | Yes | Code is available at https://github.com/kiwi12138/Realistic TTA.
Open Datasets | Yes | We evaluate our approach on CIFAR-10-C, CIFAR-100-C, and ImageNet-C (Hendrycks and Dietterich 2018)
Dataset Splits | No | The paper states that it uses pre-trained models ('pretrained by Hendrycks et al. (2020)', 'standard pretrained') and focuses on test-time adaptation; it therefore does not explicitly describe training/validation/test splits for its own experimental setup, mentioning only test image counts.
Hardware Specification | Yes | All experiments are conducted on an RTX-3090 GPU.
Software Dependencies | No | The paper mentions deep neural network frameworks and models (e.g., ResNet, BN), but it does not provide specific version numbers for software dependencies such as Python, PyTorch, or CUDA, which are necessary for full reproducibility.
Experiment Setup | Yes | We conduct experiments with test batch sizes of 200, 64, 16, 4, 2, and 1 for CIFAR-10/100-C, and 64, 16, 4, 2, and 1 for ImageNet-C. In Algorithm 1, γ is set as 1/2 and τ is set as 0.1. We respectively fix ϵ and λ at 0.1 and 0.01 for the sake of simplicity. Momentum is typically selected from the set {1, 0.1, 0.01, 0.001}.
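The hyperparameters quoted in the experiment-setup row can be gathered into a single configuration sketch. This is a minimal illustration only: the function name `make_config` and the dictionary structure are assumptions of mine, not part of the paper or its released code; the values are taken verbatim from the quoted text.

```python
# Hypothetical config sketch for the quoted experiment setup.
# Values come from the paper's text; structure and names are illustrative.

def make_config(dataset: str) -> dict:
    """Return a test-time-adaptation config for one corruption benchmark."""
    batch_sizes = {
        "CIFAR-10-C":  [200, 64, 16, 4, 2, 1],
        "CIFAR-100-C": [200, 64, 16, 4, 2, 1],
        "ImageNet-C":  [64, 16, 4, 2, 1],
    }
    return {
        "dataset": dataset,
        "test_batch_sizes": batch_sizes[dataset],
        # Algorithm 1 (layer-wise rectification) settings:
        "gamma": 0.5,     # γ = 1/2
        "tau": 0.1,       # τ = 0.1
        "epsilon": 0.1,   # ϵ fixed at 0.1
        "lambda_": 0.01,  # λ fixed at 0.01
        # Momentum is selected from this grid in the paper:
        "momentum_grid": [1, 0.1, 0.01, 0.001],
    }

cfg = make_config("ImageNet-C")
print(cfg["test_batch_sizes"])  # → [64, 16, 4, 2, 1]
```

Keeping the per-dataset batch-size lists explicit makes it easy to sweep every reported setting in one loop over `cfg["test_batch_sizes"]`.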