TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation
Authors: Hyesu Lim, Byeonggeun Kim, Jaegul Choo, Sungha Choi
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that TTN outperforms existing TTA methods in realistic evaluation settings, i.e., with a wide range of test batch sizes for single, mixed, and continuously changing domain adaptation through extensive experiments on image classification and semantic segmentation tasks. |
| Researcher Affiliation | Collaboration | ¹Qualcomm AI Research, ²KAIST |
| Pseudocode | Yes | Pseudocode for post-training, i.e., obtaining A and optimizing α, is provided in Algorithms 1 and 2, respectively, and that for test time is in Algorithm 3. Moreover, we provide PyTorch-friendly pseudocode for obtaining A in Listing 1. (A hedged sketch of the resulting interpolated BN layer follows the table.) |
| Open Source Code | No | Together with related references and publicly available codes, we believe our paper contains sufficient information for reimplementation. This statement refers to publicly available code for *related* work, not a release of or link to the authors' own implementation. |
| Open Datasets | Yes | We use corruption benchmark datasets CIFAR-10/100-C and ImageNet-C, which consist of 15 types of common corruptions at five severity levels (Hendrycks & Dietterich, 2018). |
| Dataset Splits | No | The paper mentions training and testing/evaluation datasets but does not explicitly describe a separate validation set for model tuning or early stopping during training. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and provides PyTorch-friendly pseudocode, but it does not specify version numbers for PyTorch or any other software libraries or dependencies used for the experiments. |
| Experiment Setup | Yes | For CIFAR-10/100-C, we optimized α using the augmented CIFAR-10/100 training set on the pre-trained WideResNet-40-2 (WRN-40) (Hendrycks et al., 2019). For ImageNet-C, we used the augmented ImageNet training set (randomly sampled 64000 instances per epoch) on the pre-trained ResNet-50. ... We used the Adam (Kingma & Ba, 2015) optimizer with a learning rate (LR) of 1e-3, decayed with a cosine schedule (Loshchilov & Hutter, 2017) over 30 epochs, and a training batch size of 200 for CIFAR-10/100. For ImageNet, we lowered the LR to 2.5e-4 and used a batch size of 64. ... We set the weighting hyperparameter for the MSE loss, λ, to 1. ... For SWR, we set the importance of the regularization term λr to 500. (A hedged sketch of this training configuration follows the table.) |
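The pseudocode row refers to post-training a channel-wise interpolation weight α that mixes the pre-trained model's source BN statistics with the current test-batch statistics. The sketch below is a minimal, hedged illustration of that idea only: the class name `InterpolatedBN2d`, the clamp on α, and the plain linear mixing of means and variances are our own simplifications, and the paper's procedures for obtaining A, the loss for optimizing α, and any finer details of the statistic mixing are not reproduced here.

```python
import torch
import torch.nn as nn

class InterpolatedBN2d(nn.Module):
    """Sketch of a domain-shift-aware BN layer in the spirit of TTN.

    Mixes frozen source statistics (the pre-trained model's running_mean /
    running_var) with current test-batch statistics via a channel-wise
    interpolation weight alpha. Alpha is the quantity optimized during the
    paper's post-training stage (Algorithms 1-2), which is not shown here.
    """

    def __init__(self, source_bn: nn.BatchNorm2d, init_alpha: float = 0.5):
        super().__init__()
        self.eps = source_bn.eps
        # Frozen source statistics and the source affine parameters.
        self.register_buffer("src_mean", source_bn.running_mean.clone())
        self.register_buffer("src_var", source_bn.running_var.clone())
        self.weight = source_bn.weight
        self.bias = source_bn.bias
        # Channel-wise interpolation weight, learned in post-training.
        self.alpha = nn.Parameter(torch.full_like(self.src_mean, init_alpha))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Statistics of the current test batch.
        batch_mean = x.mean(dim=(0, 2, 3))
        batch_var = x.var(dim=(0, 2, 3), unbiased=False)
        a = self.alpha.clamp(0.0, 1.0)
        mean = a * batch_mean + (1.0 - a) * self.src_mean
        var = a * batch_var + (1.0 - a) * self.src_var
        x_hat = (x - mean[None, :, None, None]) / torch.sqrt(
            var[None, :, None, None] + self.eps
        )
        return x_hat * self.weight[None, :, None, None] + self.bias[None, :, None, None]
```

With α = 0 this layer reduces to standard inference-time BN using source statistics, and with α = 1 it reduces to test-batch normalization as used by many TTA baselines; intermediate per-channel values interpolate between the two regimes.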
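The experiment-setup row can be summarized as the following hedged configuration sketch for the CIFAR-10/100-C post-training stage. Only the quoted hyperparameters (Adam, LR 1e-3 vs. 2.5e-4, cosine decay over 30 epochs, batch size 200 vs. 64, MSE weight λ = 1) come from the table; the per-channel `alpha` tensor is a stand-in for the interpolation weights inserted into the pre-trained WRN-40, and the loss computation is omitted.

```python
import torch
import torch.nn as nn

# Stand-in for the channel-wise interpolation weights inserted into the
# pre-trained WRN-40's BN layers (16 channels chosen arbitrarily here).
alpha = nn.Parameter(torch.full((16,), 0.5))

optimizer = torch.optim.Adam([alpha], lr=1e-3)        # 2.5e-4 for ImageNet-C
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=30)

lambda_mse = 1.0                # weight on the MSE term of the post-training loss
epochs, batch_size = 30, 200    # ImageNet-C setting uses 64-sample batches

for epoch in range(epochs):
    # (one pass over augmented CIFAR-10/100 batches, updating alpha, goes here)
    scheduler.step()
```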