Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

SNAP: Low-Latency Test-Time Adaptation with Sparse Updates

Authors: Hyeongheon Cha, Dong Min Kim, Hye Won Chung, Taesik Gong, Sung-Ju Lee

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Integrated with five state-of-the-art TTA algorithms, SNAP reduces latency by up to 93.12%, while keeping the accuracy drop below 3.3%, even across adaptation rates ranging from 1% to 50%. This demonstrates its strong potential for practical use on edge devices serving latency-sensitive applications. The source code is available at https://github.com/chahh9808/SNAP. Section 5, titled 'Experiments', further details the empirical studies conducted.
Researcher Affiliation	Academia	1 School of Electrical Engineering, KAIST, Daejeon, Republic of Korea; 2 Department of Computer Science and Engineering, UNIST, Ulsan, Republic of Korea
Pseudocode	Yes	Algorithm 1 Class and Domain Representative Memory (CnDRM) Management
Open Source Code	Yes	The source code is available at https://github.com/chahh9808/SNAP.
Open Datasets	Yes	We used three standard TTA benchmarks: CIFAR10-C, CIFAR100-C and ImageNet-C [10] for main evaluation. We also validate SNAP on ImageNet-R [9] and ImageNet-Sketch [49] to assess generalization (Appendix B.11).
Dataset Splits	Yes	We used three standard TTA benchmarks: CIFAR10-C, CIFAR100-C and ImageNet-C [10] for main evaluation. These datasets include 15 different types of corruption with five levels of severity, and we used the highest one.
Hardware Specification	Yes	Latency was measured on three representative edge devices: Raspberry Pi 4 [38], Raspberry Pi Zero 2 W [39], and NVIDIA Jetson Nano [32]. ... To ensure efficiency in experimentation, accuracy measurements were obtained using NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies	No	Tent [48] We update the BN affine parameters using the SGD optimizer [24]... CoTTA [50] We update all model parameters using the Adam optimizer [15]... Torchvision: PyTorch's computer vision library. https://github.com/pytorch/vision, 2016. The paper mentions software such as SGD, Adam, PyTorch, and Torchvision but does not specify their version numbers.
Experiment Setup	Yes	The main evaluation was run with diverse AR values: 0.01, 0.03, 0.05, 0.1, 0.3, and 0.5. We report the mean accuracy and standard deviation over three random seeds. ... test batch sizes were set to 16 for all baseline methods... Tent [48] We update the BN affine parameters using the SGD optimizer [24] with a learning rate of l = 1e-3 for CIFAR10/100C and l = 1e-4 for ImageNet-C. ... The confidence threshold for CnDRM τ_conf is set to 0.4 for CIFAR10-C, 0.45 for CIFAR100-C, and 0.5 for ImageNet-C. ... the parameters for the soft shrinkage function in IoBMN are fixed with α = 4 for Tent, CoTTA, SAR, RoTTA, and α = 2 for EATA.
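
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration of how the reported values fit together, not code from the SNAP repository. The names (ADAPTATION_RATES, tent_runs, and so on) are hypothetical, and the seed values are an assumption, since the row states only that three random seeds were used.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
ADAPTATION_RATES = [0.01, 0.03, 0.05, 0.1, 0.3, 0.5]
TEST_BATCH_SIZE = 16
TENT_LEARNING_RATE = {"CIFAR10-C": 1e-3, "CIFAR100-C": 1e-3, "ImageNet-C": 1e-4}
CNDRM_CONF_THRESHOLD = {"CIFAR10-C": 0.4, "CIFAR100-C": 0.45, "ImageNet-C": 0.5}
IOBMN_ALPHA = {"Tent": 4, "CoTTA": 4, "SAR": 4, "RoTTA": 4, "EATA": 2}

# Assumption: the actual seed values are not given in the row.
SEEDS = [0, 1, 2]


def tent_runs():
    """Build one hypothetical run config per (dataset, adaptation rate, seed) for Tent."""
    runs = []
    for dataset, rate, seed in product(TENT_LEARNING_RATE, ADAPTATION_RATES, SEEDS):
        runs.append({
            "method": "Tent",
            "dataset": dataset,
            "adaptation_rate": rate,
            "seed": seed,
            "batch_size": TEST_BATCH_SIZE,
            "learning_rate": TENT_LEARNING_RATE[dataset],
            "conf_threshold": CNDRM_CONF_THRESHOLD[dataset],
            "alpha": IOBMN_ALPHA["Tent"],
        })
    return runs


if __name__ == "__main__":
    runs = tent_runs()
    print(len(runs), "Tent configurations")
    print(runs[0])
```

Under these assumptions, three datasets, six adaptation rates, and three seeds give 54 Tent configurations. The other baselines would need their own optimizer settings, which the row only partially quotes.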