ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation

Authors: Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on four widely used benchmarks demonstrate that our proposed method achieves state-of-the-art performance in both classification and segmentation CTTA tasks. Note that our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions.
Researcher Affiliation | Collaboration | Jiaming Liu¹, Senqiao Yang¹, Peidong Jia¹, Renrui Zhang³, Ming Lu¹, Yandong Guo², Wei Xue⁴, Shanghang Zhang¹. ¹National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University; ²AI2Robotics; ³The Chinese University of Hong Kong; ⁴Hong Kong University of Science and Technology.
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. The methods are described in narrative text and illustrated with figures.
Open Source Code | No | The footnote on the first page states: "Project page: https://sites.google.com/view/iclr2024-vida/home". The project page itself says "Our code will be released soon.", indicating that the code is not yet publicly available.
Open Datasets | Yes | We evaluate our method on three classification CTTA benchmarks, including CIFAR10-to-CIFAR10C, CIFAR100-to-CIFAR100C (Krizhevsky et al., 2009) and ImageNet-to-ImageNet-C (Hendrycks & Dietterich, 2019). For segmentation CTTA (Yang et al., 2023b), we evaluate our method on Cityscapes-to-ACDC, where the Cityscapes dataset (Cordts et al., 2016) serves as the source domain, and the ACDC dataset (Sakaridis et al., 2021) represents the target domains. (A sketch of how such a corruption stream is typically assembled follows the table.)
Dataset Splits | No | The paper describes a Continual Test-Time Adaptation (CTTA) setting in which the model is adapted continually to a sequence of changing target domains, rather than evaluated against traditional fixed train/validation/test splits; no explicit validation splits are mentioned. (The second sketch after the table illustrates such an adaptation loop.)
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU models, CPU models, or memory capacity.
Software Dependencies | No | The paper mentions the Adam optimizer but does not provide version numbers for any of the software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We adopt ViT-base (Dosovitskiy et al., 2020) and ResNet (He et al., 2016) as the backbone in the classification CTTA. In the case of ViT-base, we resize the input images to 224x224, while maintaining the original image resolution for other backbones. For segmentation CTTA, we adopt the pre-trained Segformer-B5 model (Xie et al., 2021) as the source model. We down-sample the input size from 1920x1080 to 960x540 for target domain data. Optimization is performed using Adam (Kingma & Ba, 2014) with (β1, β2) = (0.9, 0.999). We set the learning rates to specific values for each task: 1e-4 for CIFAR10C, 5e-7 for ImageNet-C, and 3e-4 for ACDC. (The third sketch after the table maps these values onto a concrete optimizer configuration.)
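
For the Open Datasets row: the classification benchmarks are built from publicly released corruption sets. The sketch below shows one way a CIFAR-10-to-CIFAR-10-C target stream might be assembled; it assumes the Zenodo release layout of CIFAR-10-C (Hendrycks & Dietterich, 2019), one .npy array per corruption type with severities stacked, and is not code from the paper.

    # Assembling a CIFAR-10-to-CIFAR-10-C target stream (illustrative sketch).
    # Assumes the Zenodo layout: <corruption>.npy of shape (50000, 32, 32, 3)
    # holding 5 severities x 10000 test images, plus a shared labels.npy.
    import numpy as np

    CORRUPTIONS = [  # the 15 standard corruption types, visited in sequence
        "gaussian_noise", "shot_noise", "impulse_noise", "defocus_blur",
        "glass_blur", "motion_blur", "zoom_blur", "snow", "frost", "fog",
        "brightness", "contrast", "elastic_transform", "pixelate",
        "jpeg_compression",
    ]

    def load_corruption(root, name, severity=5):
        """Return the 10,000 test images of one corruption at one severity."""
        images = np.load(f"{root}/{name}.npy")  # severities stacked along axis 0
        labels = np.load(f"{root}/labels.npy")
        lo, hi = (severity - 1) * 10000, severity * 10000
        return images[lo:hi], labels[lo:hi]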
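
For the Dataset Splits row: in CTTA there is no held-out validation split; a single source-pretrained model is updated online while it predicts, domain after domain. Below is a minimal sketch of such an evaluation loop, assuming PyTorch; adapt_step is a hypothetical placeholder for the paper's ViDA update rule, not its actual implementation.

    # Continual test-time adaptation loop (illustrative sketch, PyTorch assumed).
    # Model state carries over between domains; there is no reset and no
    # separate validation pass. adapt_step is a hypothetical stand-in for the
    # paper's ViDA update rule.
    import torch

    def ctta_evaluate(model, domain_loaders, adapt_step):
        """domain_loaders: ordered {domain_name: DataLoader} of target domains."""
        results = {}
        for name, loader in domain_loaders.items():  # domains arrive in sequence
            correct, total = 0, 0
            for images, labels in loader:
                logits = adapt_step(model, images)  # predict and adapt in one pass
                correct += (logits.argmax(dim=1) == labels).sum().item()
                total += labels.numel()
            results[name] = correct / total
        return results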
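
For the Experiment Setup row: the quoted hyperparameters map directly onto an optimizer configuration. A minimal sketch, assuming PyTorch and a timm ViT-Base backbone (neither framework nor model source is named in the paper):

    # Optimizer configuration from the quoted setup (illustrative sketch).
    # PyTorch and timm are assumptions; the paper does not name a framework.
    import timm
    import torch

    LEARNING_RATES = {  # per-task values quoted in the Experiment Setup row
        "cifar10c": 1e-4,
        "imagenet_c": 5e-7,
        "acdc": 3e-4,
    }

    model = timm.create_model("vit_base_patch16_224", pretrained=True)  # 224x224
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=LEARNING_RATES["cifar10c"],
        betas=(0.9, 0.999),  # (beta1, beta2) = (0.9, 0.999) as quoted
    )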