Test-Time Adaptation via Self-Training with Nearest Neighbor Information

Authors: Minguk Jang, Sae-Young Chung, Hye Won Chung

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | TAST showed better performance than the state-of-the-art TTA methods on two standard benchmark tasks, domain generalization, namely VLCS, PACS, OfficeHome, and TerraIncognita, and image corruption, particularly CIFAR-10/100C. Our code is available at https://github.com/mingukjang/TAST.
Researcher Affiliation | Academia | Minguk Jang, Sae-Young Chung, Hye Won Chung; School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea; {mgjang, schung, hwchung}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: Test-time Adaptation via Self-Training with nearest neighbor information (TAST). (An illustrative sketch of the adaptation step appears after this table.)
Open Source Code | Yes | Our code is available at https://github.com/mingukjang/TAST.
Open Datasets | Yes | We test TAST on four domain generalization benchmarks, specifically VLCS (Fang et al., 2013), PACS (Li et al., 2017), OfficeHome (Venkateswara et al., 2017), and TerraIncognita (Beery et al., 2018). For a fair comparison, we follow the training setup, including dataset splits and hyperparameter selection method, used in T3A.
Dataset Splits | Yes | We split each dataset of training domains into training and validation sets. The training and validation sets are used for network training and hyperparameter selection, respectively. Specifically, we split each dataset into 80% and 20% and use the smaller set as the validation set. We choose the hyperparameters that maximize the validation accuracy of the adapted classifier.
Hardware Specification | Yes | We conduct our experiments on a TITAN Xp GPU.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, the SGD optimizer, and ResNet-50, but does not provide version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | TAST involves four hyperparameters: the number of gradient steps per adaptation T, the number of support examples per class M, the number of nearby support examples Ns, and the number of adaptation modules Ne. We define a finite set of possible values for each hyperparameter: Ns ∈ {1, 2, 4, 8}, T ∈ {1, 3}, and M ∈ {1, 5, 20, 50, 100, -1}, where -1 means storing all samples without filtering. Ne is set to 20. We use the Adam optimizer with a learning rate of 0.001. (A sketch of this hyperparameter selection appears after this table.)
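
The Pseudocode row above only names Algorithm 1, so the following is a minimal PyTorch sketch of the general idea it refers to: self-training a small adaptation module on top of a frozen backbone using pseudo-labels voted by nearby support examples. The helper names (nn_pseudo_labels, adapt_on_batch), the cosine similarity, and the adapter architecture are assumptions made for illustration; the paper's actual Algorithm 1 additionally maintains a filtered support set and an ensemble of Ne adaptation modules, so this is not the authors' exact procedure.

```python
import torch
import torch.nn.functional as F


def nn_pseudo_labels(embeddings, support_embs, support_labels, num_classes, ns):
    """Soft pseudo-labels from the ns nearest support examples (cosine similarity assumed)."""
    embeddings = F.normalize(embeddings, dim=1)
    support_embs = F.normalize(support_embs, dim=1)
    sims = embeddings @ support_embs.t()              # (batch, support) similarity matrix
    nn_idx = sims.topk(ns, dim=1).indices             # indices of the ns nearest supports
    onehot = F.one_hot(support_labels, num_classes).float()
    return onehot[nn_idx].mean(dim=1)                 # average the neighbors' one-hot labels


def adapt_on_batch(adapter, frozen_backbone, x, support_embs, support_labels,
                   num_classes, ns=8, steps=1, lr=1e-3):
    """Run `steps` gradient updates of the adaptation module on one test batch."""
    optimizer = torch.optim.Adam(adapter.parameters(), lr=lr)
    with torch.no_grad():
        feats = frozen_backbone(x)                    # the backbone stays frozen
    for _ in range(steps):
        logits = adapter(feats)                       # class scores from the adaptation module
        with torch.no_grad():
            targets = nn_pseudo_labels(feats, support_embs, support_labels,
                                       num_classes, ns)
        # cross-entropy against the soft nearest-neighbor pseudo-labels
        loss = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return adapter(feats).argmax(dim=1)           # predictions after adaptation
```

The values ns=8, steps=1, and lr=1e-3 in the signature are taken from the hyperparameter ranges and optimizer settings quoted in the Experiment Setup row; everything else is a simplified stand-in.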
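The Dataset Splits and Experiment Setup rows describe an 80/20 split per training-domain dataset and a grid search over Ns, T, and M that maximizes validation accuracy of the adapted classifier. Below is a hedged sketch of that selection loop under stated assumptions: evaluate_adapted is a hypothetical callback standing in for adapting and validating the classifier, and the random per-dataset shuffle is an assumption about how the 80/20 partition is drawn.

```python
import itertools
import random

# Search space quoted from the Experiment Setup row; Ne and the learning rate are fixed.
SEARCH_SPACE = {
    "Ns": [1, 2, 4, 8],            # number of nearby support examples
    "T": [1, 3],                   # gradient steps per adaptation
    "M": [1, 5, 20, 50, 100, -1],  # support examples per class (-1 keeps all samples)
}
NE = 20        # number of adaptation modules
LR = 1e-3      # Adam learning rate


def split_80_20(dataset, seed=0):
    """Split one training-domain dataset into 80% training / 20% validation indices."""
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    cut = int(0.8 * len(indices))
    return indices[:cut], indices[cut:]


def select_hyperparameters(evaluate_adapted):
    """Pick the (Ns, T, M) combination maximizing validation accuracy.

    `evaluate_adapted(ns, t, m)` is a hypothetical helper that adapts the classifier
    with the given hyperparameters and returns its validation accuracy.
    """
    best_config, best_acc = None, float("-inf")
    for ns, t, m in itertools.product(*SEARCH_SPACE.values()):
        acc = evaluate_adapted(ns=ns, t=t, m=m)
        if acc > best_acc:
            best_config, best_acc = (ns, t, m), acc
    return best_config, best_acc
```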