Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

PAID: Pairwise Angular-Invariant Decomposition for Continual Test-Time Adaptation

Authors: Kunyu Wang, Xueyang Fu, Yuanfei Bao, Chengjie Ge, Chengzhi Cao, Wei Zhai, Zheng-Jun Zha

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our method on three classification CTTA benchmarks: CIFAR10-to-CIFAR10C, CIFAR100-to-CIFAR100C [23], and ImageNet-to-ImageNet-C [14]. In the classification tasks, we follow the sequential adaptation process described in [51], where the pre-trained source model adapts to each of the 15 target domains, each defined by the highest corruption severity. Online prediction results are immediately assessed after processing the input. For segmentation CTTA, we assess our method on Cityscapes-to-ACDC, where Cityscapes [3] serves as the source domain and ACDC [49] as the target domains, which includes images captured under four distinct unobserved visual conditions: Fog, Night, Rain, and Snow.
Researcher Affiliation	Academia	Kunyu Wang, Xueyang Fu, Yuanfei Bao, Chengjie Ge, Chengzhi Cao, Wei Zhai, Zheng-Jun Zha; University of Science and Technology of China
Pseudocode	No	The paper describes the methodology using mathematical equations and textual explanations, but it does not contain any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code	Yes	Our code is available at https://github.com/wangkunyu241/PAID.
Open Datasets	Yes	We evaluate our method on three classification CTTA benchmarks: CIFAR10-to-CIFAR10C, CIFAR100-to-CIFAR100C [23], and ImageNet-to-ImageNet-C [14]. In the classification tasks, we follow the sequential adaptation process described in [51], where the pre-trained source model adapts to each of the 15 target domains, each defined by the highest corruption severity. Online prediction results are immediately assessed after processing the input. For segmentation CTTA, we assess our method on Cityscapes-to-ACDC, where Cityscapes [3] serves as the source domain and ACDC [49] as the target domains, which includes images captured under four distinct unobserved visual conditions: Fog, Night, Rain, and Snow.
Dataset Splits	Yes	In the classification tasks, we follow the sequential adaptation process described in [51], where the pre-trained source model adapts to each of the 15 target domains, each defined by the highest corruption severity. Online prediction results are immediately assessed after processing the input. For segmentation CTTA, we assess our method on Cityscapes-to-ACDC, where Cityscapes [3] serves as the source domain and ACDC [49] as the target domains, which includes images captured under four distinct unobserved visual conditions: Fog, Night, Rain, and Snow. To simulate continual environmental changes, we cyclically iterate through the same sequence of target domains (Fog → Night → Rain → Snow) three rounds, reflecting real-world scenarios. ... We pre-compute the source domain statistics (µs, σs) using a randomly sampled subset of 500 images from the source domain Ds.
Hardware Specification	Yes	The experiments are conducted on the NVIDIA RTX 3090 GPU.
Software Dependencies	No	The AdamW [35] optimizer is used with parameters (β1, β2) = (0.9, 0.999). No specific versions for general software dependencies like programming languages or deep learning frameworks are mentioned.
Experiment Setup	Yes	Hyperparameters for CIFAR10-C, CIFAR100-C, ImageNet-C, and ACDC are set as follows: batch size {64, 64, 64, 1}, orthogonal matrix coefficient {12, 12, 12, 12}, and loss coefficient {1.0, 1.0, 0.1, 1.0}. To initialize the learnable parameters, we perform warm-up iterations on classification datasets. For linear layer selection, we inject all linear layers, including those in the multi-head attention block (q, k, v, o) and the MLP block (m). The number of source examples is set to 500.
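
For readers attempting a reproduction, the reported setup can be collected into one place. The following is a minimal sketch, not the authors' code: the names `CTTAConfig`, `CONFIGS`, `ACDC_DOMAINS`, and `cyclic_schedule` are hypothetical, and only the numeric values and the domain order are taken from the table above. The number of warm-up iterations is not reported in the excerpts quoted here, so it is left out.

```python
from dataclasses import dataclass
from itertools import chain, repeat


@dataclass(frozen=True)
class CTTAConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""
    batch_size: int
    orthogonal_matrix_coefficient: int
    loss_coefficient: float
    num_source_examples: int = 500           # images used for source statistics
    adam_betas: tuple = (0.9, 0.999)         # AdamW (beta1, beta2)


# Values copied from the "Experiment Setup" row; keys are the benchmark names.
CONFIGS = {
    "CIFAR10-C": CTTAConfig(64, 12, 1.0),
    "CIFAR100-C": CTTAConfig(64, 12, 1.0),
    "ImageNet-C": CTTAConfig(64, 12, 0.1),
    "ACDC": CTTAConfig(1, 12, 1.0),
}

# Target-domain order for the segmentation benchmark.
ACDC_DOMAINS = ("Fog", "Night", "Rain", "Snow")


def cyclic_schedule(domains, rounds):
    """Repeat the same domain sequence `rounds` times, preserving order."""
    return list(chain.from_iterable(repeat(tuple(domains), rounds)))


# Three rounds over four domains gives twelve adaptation stages.
schedule = cyclic_schedule(ACDC_DOMAINS, rounds=3)
```

A reproduction script could iterate over `schedule` and adapt the model on each domain in turn without resetting it between stages, which is the continual setting the paper describes.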