Test-Time Adaptation via Self-Training with Nearest Neighbor Information
Authors: Minguk Jang, Sae-Young Chung, Hye Won Chung
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | TAST showed better performance than the state-of-the-art TTA methods on two standard benchmark tasks: domain generalization (VLCS, PACS, OfficeHome, and TerraIncognita) and image corruption (CIFAR-10/100C). Our code is available at https://github.com/mingukjang/TAST. |
| Researcher Affiliation | Academia | Minguk Jang, Sae-Young Chung, Hye Won Chung; School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea; {mgjang, schung, hwchung}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: Test-time Adaptation via Self-Training with nearest neighbor information (TAST). A hedged sketch of this adaptation step appears after the table. |
| Open Source Code | Yes | Our code is available at https://github.com/mingukjang/TAST. |
| Open Datasets | Yes | We test TAST on four domain generalization benchmarks, specifically VLCS (Fang et al., 2013), PACS (Li et al., 2017), Office Home (Venkateswara et al., 2017), and Terra Incognita (Beery et al., 2018). For a fair comparison, we follow the training setup including dataset splits and hyperparameter selection method used in T3A. |
| Dataset Splits | Yes | We split each dataset of training domains into training and validation sets. The training and validation sets are used for network training and hyperparameter selection, respectively. Specifically, we split each dataset into 80% and 20% and use the smaller set as the validation set. We choose the hyperparameters that maximize the validation accuracy of the adapted classifier. (See the selection sketch after the table.) |
| Hardware Specification | Yes | We conduct our experiments on an NVIDIA TITAN Xp GPU. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'SGD optimizer', and 'ResNet-50', but does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | TAST involves four hyperparameters: the number of gradient steps per adaptation T, the number of support examples per class M, the number of nearby support examples Ns, and the number of adaptation modules Ne. We define a finite set of possible values for each hyperparameter: Ns ∈ {1, 2, 4, 8}, T ∈ {1, 3}, and M ∈ {1, 5, 20, 50, 100, -1}, where -1 means storing all samples without filtering. Ne is set to 20. We use the Adam optimizer with a learning rate of 0.001. (See the selection sketch after the table.) |
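
For readers who want a concrete picture of Algorithm 1, the following is a minimal sketch of one TAST-style adaptation step, assuming a PyTorch setup with a frozen feature extractor, a T3A-style support set of (feature, pseudo-label) pairs, and `Ne` small trainable heads (the "adaptation modules"). The helper name `tast_adapt_step`, the temperature `tau`, and the soft cross-entropy form are illustrative assumptions, not the authors' exact implementation; the released code at https://github.com/mingukjang/TAST is the reference version.

```python
import torch
import torch.nn.functional as F

def tast_adapt_step(z_test, support_z, support_y, adapters, optimizer,
                    num_classes, Ns=8, T=1, tau=0.1):
    """Hedged sketch of one TAST-style adaptation step (not the reference code).

    z_test:    [B, D] frozen-backbone features of the current test batch
    support_z: [S, D] stored support features (filtered as in T3A, at most M per class)
    support_y: [S]    pseudo-labels of the stored support features (int64 tensor)
    adapters:  list of Ne small trainable heads mapping D -> D'
    """
    for _ in range(T):                    # T gradient steps per adaptation (paper: 1 or 3)
        loss = 0.0
        for head in adapters:             # average the loss over the Ne adaptation modules
            h_test = F.normalize(head(z_test), dim=1)
            h_sup = F.normalize(head(support_z), dim=1)

            # Class prototypes: normalized per-class sum of support features in the adapted space.
            proto = torch.zeros(num_classes, h_sup.size(1), device=h_sup.device)
            proto = F.normalize(proto.index_add(0, support_y, h_sup), dim=1)

            # Prototype-based class distribution for each test feature.
            p_proto = F.softmax(h_test @ proto.t() / tau, dim=1)

            # Pseudo-label from the Ns nearest support examples (the self-training target).
            sim = h_test @ h_sup.t()
            _, nn_idx = sim.topk(k=min(Ns, h_sup.size(0)), dim=1)
            target = F.one_hot(support_y[nn_idx], num_classes).float().mean(dim=1)

            # Soft cross-entropy between the nearest-neighbor target and the prototype prediction.
            loss = loss + (-(target * torch.log(p_proto + 1e-8)).sum(dim=1)).mean()

        optimizer.zero_grad()
        (loss / len(adapters)).backward()
        optimizer.step()
```

In line with the reported setup, `optimizer` would be `torch.optim.Adam` over the parameters of all Ne = 20 adaptation modules with a learning rate of 0.001, and the final prediction would average the prototype-based distributions of the adapted modules.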
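
The dataset-split and hyperparameter-selection procedure described in the table can likewise be sketched in a few lines. The grid below mirrors the reported search space (Ns, T, M, with Ne fixed at 20) and the 80/20 split; the function names `split_80_20` and `select_hparams` and the `evaluate` callback are hypothetical illustrations, not the authors' code.

```python
import itertools
import torch
from torch.utils.data import random_split

# Search space reported in the paper; M = -1 means "store all samples without filtering".
GRID = {"Ns": [1, 2, 4, 8], "T": [1, 3], "M": [1, 5, 20, 50, 100, -1]}
NE = 20  # the number of adaptation modules is fixed

def split_80_20(dataset, seed=0):
    """Split one training-domain dataset 80/20; the smaller part is the validation set."""
    n_val = int(0.2 * len(dataset))
    gen = torch.Generator().manual_seed(seed)
    return random_split(dataset, [len(dataset) - n_val, n_val], generator=gen)

def select_hparams(evaluate):
    """Pick the (Ns, T, M) combination that maximizes validation accuracy.

    `evaluate` is an assumed callback that adapts the classifier with the given
    hyperparameters and returns its accuracy on the validation split.
    """
    best, best_acc = None, -1.0
    for Ns, T, M in itertools.product(GRID["Ns"], GRID["T"], GRID["M"]):
        acc = evaluate(Ns=Ns, T=T, M=M, Ne=NE)
        if acc > best_acc:
            best, best_acc = {"Ns": Ns, "T": T, "M": M, "Ne": NE}, acc
    return best, best_acc
```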