Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Persistent Test-time Adaptation in Recurring Testing Scenarios
Authors: Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The supreme stability of PeTTA over existing approaches, in the face of lifelong TTA scenarios, has been demonstrated over comprehensive experiments on various benchmarks. Our project page is available at https://hthieu166.github.io/petta. |
| Researcher Affiliation | Academia | Trung-Hieu Hoang1 Duc Minh Vo2 Minh N. Do1,3 1Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign 2The University of Tokyo 3VinUni-Illinois Smart Health Center, VinUniversity EMAIL EMAIL |
| Pseudocode | Yes | Appdx. E.1 introduces the pseudocode of PeTTA. |
| Open Source Code | Yes | Our project page is available at https://hthieu166.github.io/petta. The source code of PeTTA is also attached as supplemental material. |
| Open Datasets | Yes | Specifically, CIFAR10 → CIFAR10-C, CIFAR100 → CIFAR100-C, and ImageNet → ImageNet-C [19] are three corrupted image classification tasks... Additionally, we incorporate DomainNet [44]... All the datasets, including CIFAR-10-C, CIFAR100-C and ImageNet-C [19] are publicly available online, released under Apache-2.0 license. |
| Dataset Splits | Yes | Following the practical TTA setup, multiple testing scenarios from each testing set will gradually change from one to another while the Dirichlet distribution (Dir(0.1) for CIFAR10-C, DomainNet, and ImageNet-C, and Dir(0.01) for CIFAR100-C) generates temporally correlated batches of data by category. For evaluation, an independent set of 2000 samples following the same distribution is used for computing the prediction frequency and the false negative rate (FNR). |
| Hardware Specification | Yes | A computer cluster equipped with an Intel(R) Core(TM) 3.80GHz i7-10700K CPU, 64 GB RAM, and one NVIDIA GeForce RTX 3090 GPU (24 GB VRAM) is used for our experiments. |
| Software Dependencies | No | We use PyTorch [43] for implementation. RobustBench [10] and torchvision [35] provide pre-trained source models. |
| Experiment Setup | Yes | Unless otherwise noted, for all PeTTA experiments, the EMA update rate for robust batch normalization [61] and feature embedding statistics is set to 5e-2; α0 = 1e-3 and the cosine similarity regularizer is used. On CIFAR10/100-C and ImageNet-C we use the self-training loss in [12] for LCLS and λ0 = 10, while the regular cross-entropy loss [13] and λ0 = 1 (severe domain shift requires prioritizing adaptability) are applied in DomainNet experiments. |
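The Dataset Splits row quotes a setup where a Dirichlet distribution with low concentration (Dir(0.1) or Dir(0.01)) produces temporally correlated, class-imbalanced test batches. A minimal sketch of that idea, not the authors' code, is shown below: each class's samples are split across time slots with Dirichlet-drawn proportions, so a low concentration makes each slot dominated by a few classes. All names (`dirichlet_batches`, the toy label array) are illustrative.

```python
# Sketch (assumption, not the paper's implementation) of Dirichlet-based
# temporally correlated test batches, as in the quoted practical TTA setup.
import numpy as np

def dirichlet_batches(labels, num_classes, concentration=0.1,
                      num_slots=10, seed=0):
    """Assign each class's sample indices across `num_slots` time slots
    with proportions drawn from Dir(concentration). Low concentration
    concentrates a class's mass in few slots -> correlated batches."""
    rng = np.random.default_rng(seed)
    slots = [[] for _ in range(num_slots)]
    for c in range(num_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Per-slot share of this class's samples.
        props = rng.dirichlet(np.full(num_slots, concentration))
        counts = (props * len(idx)).astype(int)
        start = 0
        for s, n in enumerate(counts):
            slots[s].extend(idx[start:start + n].tolist())
            start += n
    return slots

# Toy balanced test set: 10 classes x 100 samples.
labels = np.repeat(np.arange(10), 100)
slots = dirichlet_batches(labels, num_classes=10, concentration=0.1)
```

With `concentration=0.1`, inspecting the label histogram of each slot shows most slots are dominated by one or two classes, mimicking temporal class correlation at test time.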
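The Experiment Setup row quotes an EMA update rate of 5e-2 for batch-normalization and feature-embedding statistics. As a hedged illustration (the function name and usage are hypothetical, not from the paper), the standard exponential-moving-average update that rate plugs into is:

```python
# Hypothetical sketch of an EMA statistics update with rate 5e-2,
# as quoted in the Experiment Setup row; not the authors' code.
def ema_update(running, new, rate=5e-2):
    """Standard EMA: running <- (1 - rate) * running + rate * new."""
    return (1.0 - rate) * running + rate * new

# A small rate means running statistics drift slowly toward new batches.
mu = 0.0
for x in [1.0, 1.0, 1.0]:
    mu = ema_update(mu, x)
# mu has moved only a small fraction of the way toward 1.0
```

A small rate like 5e-2 keeps the running statistics stable against any single shifted test batch, which matches the stability focus of the quoted setup.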