ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
Authors: Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on four widely used benchmarks demonstrate that our proposed method achieves state-of-the-art performance in both classification and segmentation CTTA tasks. Note that our method can be regarded as a novel transfer paradigm for large-scale models, delivering promising results in adaptation to continually changing distributions. |
| Researcher Affiliation | Collaboration | Jiaming Liu¹, Senqiao Yang¹, Peidong Jia¹, Renrui Zhang³, Ming Lu¹, Yandong Guo², Wei Xue⁴, Shanghang Zhang¹ — ¹National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University; ²AI2Robotics; ³The Chinese University of Hong Kong; ⁴Hong Kong University of Science and Technology |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. The methods are described in narrative text and illustrated with figures. |
| Open Source Code | No | The footnote on the first page states: "Project page: https://sites.google.com/view/iclr2024-vida/home". The project page states: "Our code will be released soon.", indicating that the code is not yet publicly available. |
| Open Datasets | Yes | We evaluate our method on three classification CTTA benchmarks, including CIFAR10-to-CIFAR10C, CIFAR100-to-CIFAR100C (Krizhevsky et al., 2009) and ImageNet-to-ImageNet-C (Hendrycks & Dietterich, 2019). For segmentation CTTA (Yang et al., 2023b), we evaluate our method on Cityscapes-to-ACDC, where the Cityscapes dataset (Cordts et al., 2016) serves as the source domain, and the ACDC dataset (Sakaridis et al., 2021) represents the target domains. |
| Dataset Splits | No | The paper describes a Continual Test-Time Adaptation (CTTA) setting where models are continually adapted to changing target domains over time, rather than using traditional fixed train/validation/test splits. It does not mention explicit validation dataset splits. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments, such as specific GPU models, CPU models, or memory details. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We adopt ViT-base (Dosovitskiy et al., 2020) and ResNet (He et al., 2016) as the backbones for classification CTTA. In the case of ViT-base, we resize the input images to 224x224, while maintaining the original image resolution for other backbones. For segmentation CTTA, we adopt the pre-trained Segformer-B5 model (Xie et al., 2021) as the source model. We down-sample the input size from 1920x1080 to 960x540 for target domain data. Optimization is performed using Adam (Kingma & Ba, 2014) with (β1, β2) = (0.9, 0.999). We set the learning rates to specific values for each task: 1e-4 for CIFAR10C, 5e-7 for ImageNet-C, and 3e-4 for ACDC. (See the optimizer sketch after this table.) |
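
The Experiment Setup row pins down the optimizer hyperparameters exactly (Adam, betas of (0.9, 0.999), per-task learning rates), so the optimizer can be reconstructed even without released code. Below is a minimal PyTorch sketch, assuming a torch-based pipeline (the paper does not name its framework); the `TASK_LR` keys and `build_optimizer` helper are hypothetical names, while the learning rates and betas are the values quoted above.

```python
import torch
from torch.optim import Adam

# Learning rates quoted in the paper's experiment setup, keyed by
# benchmark. The key names themselves are illustrative, not from
# the paper.
TASK_LR = {
    "cifar10c": 1e-4,
    "imagenet_c": 5e-7,
    "acdc": 3e-4,
}

def build_optimizer(model: torch.nn.Module, task: str) -> Adam:
    """Adam with (beta1, beta2) = (0.9, 0.999), as stated in the paper."""
    return Adam(model.parameters(), lr=TASK_LR[task], betas=(0.9, 0.999))

# Example usage with a placeholder module; the paper's actual backbones
# are ViT-base, ResNet, or Segformer-B5 depending on the task.
model = torch.nn.Linear(10, 10)  # stand-in for the real backbone
optimizer = build_optimizer(model, "cifar10c")
```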