DELTA: DEGRADATION-FREE FULLY TEST-TIME ADAPTATION

Authors: Bowen Zhao, Chen Chen, Shu-Tao Xia

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate various test-time adaptation methods on three commonly used datasets with four scenarios, and a newly introduced real-world dataset. DELTA can help them deal with all scenarios simultaneously, leading to SOTA performance.
Researcher Affiliation | Collaboration | Bowen Zhao1,2, Chen Chen3, Shu-Tao Xia1,4; 1Tsinghua University, 2Tencent TEG AI, 3OPPO Research Institute, 4Peng Cheng Laboratory
Pseudocode | Yes | Algorithm 1: Dynamic Online reweighTing (DOT) ... Algorithm 2: Test-time Batch Renormalization (TBR) module (see the sketch after the table).
Open Source Code | Yes | Code is available online.
Open Datasets | Yes | We conduct experiments on common datasets CIFAR100-C, ImageNet-C (Hendrycks & Dietterich, 2019), ImageNet-R (Hendrycks et al., 2021), and a newly introduced video (segments) dataset: the subset of YouTube-BoundingBoxes (YTBB-sub) (Real et al., 2017).
Dataset Splits | No | No explicit statements specifying dataset splits (e.g., percentages or counts for training, validation, and test sets) were found in the paper's main text for reproducibility of data partitioning.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or specific cloud instances) used for running the experiments were provided in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8') were explicitly stated in the paper.
Experiment Setup | Yes | We use Adam optimizer with learning rate of 1e-3, batch size of 200 for CIFAR100-C; SGD optimizer with learning rate of 2.5e-4, batch size of 64 for ImageNet-C/-R; SGD optimizer with learning rate of 2.5e-4, batch size of 200 for YTBB-sub. For DELTA, the hyper-parameters α and λ are roughly selected from {0.9, 0.95, 0.99, 0.999} on validation sets, e.g., the extra sets with corruption types outside the 15 types used in the benchmark. The smoothing coefficient α in TBR is set to 0.95 for CIFAR100-C and ImageNet-C/-R, 0.999 for YTBB-sub; λ in DOT is set to 0.95 for ImageNet-C/-R and 0.9 for CIFAR100-C / YTBB-sub. (See the configuration sketch after the table.)
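
The Pseudocode row only names the two components, so the following is a minimal sketch of how a test-time batch-renormalization step and a dynamic online reweighting step could look, assuming TBR maintains test-time moving averages of the normalization statistics with smoothing coefficient α and DOT keeps a momentum estimate of the class distribution with coefficient λ. All identifiers (TestTimeBatchRenorm, dot_sample_weights, class_freq) are hypothetical and are not taken from the paper's released code.

```python
import torch


class TestTimeBatchRenorm:
    """Illustrative TBR-style step: normalize with test-time moving averages
    of the batch statistics (coefficient alpha) instead of the raw per-batch
    statistics. A sketch under stated assumptions, not the authors' code."""

    def __init__(self, num_features, alpha=0.95, eps=1e-5):
        self.alpha = alpha
        self.eps = eps
        self.ema_mean = torch.zeros(num_features)
        self.ema_var = torch.ones(num_features)

    def __call__(self, x):  # x: (N, C) feature activations
        batch_mean = x.mean(dim=0).detach()
        batch_var = x.var(dim=0, unbiased=False).detach()
        # Smooth the test-batch statistics over the incoming stream.
        self.ema_mean = self.alpha * self.ema_mean + (1 - self.alpha) * batch_mean
        self.ema_var = self.alpha * self.ema_var + (1 - self.alpha) * batch_var
        # Normalize with the smoothed statistics.
        return (x - self.ema_mean) / torch.sqrt(self.ema_var + self.eps)


def dot_sample_weights(probs, class_freq, lam=0.95, eps=1e-8):
    """Illustrative DOT-style step: class_freq is a momentum estimate
    (coefficient lam) of the class distribution seen so far; samples whose
    predicted class is already frequent get down-weighted in the loss."""
    pred = probs.argmax(dim=1)
    weights = 1.0 / (class_freq[pred] + eps)
    weights = weights / weights.mean()  # normalize within the batch
    class_freq = lam * class_freq + (1 - lam) * probs.mean(dim=0).detach()
    return weights, class_freq
```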
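
The settings quoted in the Experiment Setup row translate directly into configuration code. Below is a hedged sketch assuming a PyTorch-style stack; build_tta_optimizer and the dataset keys are hypothetical names, while the learning rates, batch sizes, and α/λ values come from the quoted text.

```python
import torch


def build_tta_optimizer(model, dataset):
    """Hypothetical helper mirroring the per-dataset optimizer settings;
    model construction, data loading, and the adaptation loop are assumed
    to exist elsewhere."""
    params = [p for p in model.parameters() if p.requires_grad]
    if dataset == "cifar100c":
        return torch.optim.Adam(params, lr=1e-3)   # batch size 200
    if dataset in ("imagenet-c", "imagenet-r"):
        return torch.optim.SGD(params, lr=2.5e-4)  # batch size 64
    if dataset == "ytbb-sub":
        return torch.optim.SGD(params, lr=2.5e-4)  # batch size 200
    raise ValueError(f"unknown dataset: {dataset}")


# Coefficients reported in the paper: alpha (TBR) = 0.95 for CIFAR100-C and
# ImageNet-C/-R, 0.999 for YTBB-sub; lambda (DOT) = 0.95 for ImageNet-C/-R,
# 0.9 for CIFAR100-C and YTBB-sub.
TBR_ALPHA = {"cifar100c": 0.95, "imagenet-c": 0.95, "imagenet-r": 0.95, "ytbb-sub": 0.999}
DOT_LAMBDA = {"cifar100c": 0.9, "imagenet-c": 0.95, "imagenet-r": 0.95, "ytbb-sub": 0.9}
```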