Deja Vu: Continual Model Generalization for Unseen Domains

Authors: Chenxi Liu, Lixu Wang, Lingjuan Lyu, Chen Sun, Xiao Wang, Qi Zhu

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on Digits, PACS, and DomainNet demonstrate that RaTP significantly outperforms state-of-the-art works from Continual DA, Source-Free DA, Test-Time/Online DA, Single DG, Multiple DG and Unified DA&DG in TDG, and achieves comparable TDA and FA capabilities.
Researcher Affiliation | Collaboration | Chenxi Liu [1], Lixu Wang [1], Lingjuan Lyu [2], Chen Sun [2], Xiao Wang [1], Qi Zhu [1]; [1] Northwestern University, [2] Sony AI. {chenxiliu2020,lixuwang2025}@u.northwestern.edu, {Lingjuan.Lv,chen.sun}@sony.com, {wangxiao,qzhu}@northwestern.edu
Pseudocode | No | The paper describes its algorithms and modules in text and diagrams (Figure 1) but does not include any explicit pseudocode blocks or algorithm listings.
Open Source Code | Yes | Our code is available at https://github.com/SonyAI/RaTP.
Open Datasets | Yes | Digits consists of 5 different domains with 10 classes, including MNIST (MT), SVHN (SN), MNIST-M (MM), SYN-D (SD) and USPS (US). PACS contains 4 domains with 7 classes, including Photo (P), Art painting (A), Cartoon (C), and Sketch (S). DomainNet is the most challenging cross-domain dataset, including Quickdraw (Qu), Clipart (Cl), Painting (Pa), Infograph (In), Sketch (Sk) and Real (Re). Considering label noise and class imbalance, we follow Xie et al. (2022) to split a subset of DomainNet for our experiments.
Dataset Splits | No | In the source domain, we randomly split 80% of the data as the training set and the remaining 20% as the testing set. In target domains, all data are used for training and testing. (A minimal sketch of this split appears after the table.)
Hardware Specification | No | The paper does not explicitly specify the hardware used for experiments, such as particular GPU or CPU models, or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments.
Experiment Setup | Yes | The SGD optimizer with an initial learning rate of 0.01 is used for Digits, and 0.005 for PACS and DomainNet. The exemplar memory size is set to 200 for all datasets, and the batch size is 64. For all experiments, we conduct multiple runs with three seeds (2022, 2023, 2024) and report the average performance. We use 800 steps for Digits, 50 steps for PACS, and 70 steps for DomainNet. Training runs for 30 epochs on all datasets and domains. ... we use momentum of 0.9 and weight decay of 0.0005 to schedule the SGD. (A minimal configuration sketch appears after the table.)
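
The Dataset Splits row quotes an 80%/20% random train/test split of the source domain. The following is a minimal sketch of such a split, assuming PyTorch; the dataset object, helper name, and seed handling are illustrative assumptions, not details taken from the paper.

import torch
from torch.utils.data import random_split

def split_source_domain(dataset, train_frac=0.8, seed=2022):
    """Randomly split a source-domain dataset into 80% train / 20% test,
    mirroring the split quoted above. The generator seed is an illustrative
    choice (the paper reports runs with seeds 2022-2024)."""
    n_train = int(train_frac * len(dataset))
    n_test = len(dataset) - n_train
    generator = torch.Generator().manual_seed(seed)
    train_set, test_set = random_split(dataset, [n_train, n_test], generator=generator)
    return train_set, test_set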
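
The Experiment Setup row can be read as the following hedged configuration sketch, again assuming PyTorch; the model argument, dataset names, and the helper function are hypothetical placeholders, and only the listed hyperparameter values come from the paper.

import torch

# Hyperparameters reported in the paper's experiment setup.
BATCH_SIZE = 64
EXEMPLAR_MEMORY_SIZE = 200
EPOCHS = 30
SEEDS = (2022, 2023, 2024)
LEARNING_RATES = {"Digits": 0.01, "PACS": 0.005, "DomainNet": 0.005}

def make_optimizer(model, dataset_name):
    """SGD with momentum 0.9 and weight decay 0.0005, as quoted above.
    `model` and `dataset_name` are illustrative arguments."""
    return torch.optim.SGD(
        model.parameters(),
        lr=LEARNING_RATES[dataset_name],
        momentum=0.9,
        weight_decay=5e-4,
    )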