Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization

Authors: Haiyang Yang, Xiaotong Li, Shixiang Tang, Feng Zhu, Yizhou Wang, Meilin Chen, Lei Bai, Rui Zhao, Wanli Ouyang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Quantitative results on extensive datasets illustrate our method improves the state-of-the-art unsupervised domain generalization methods by an average of +5.59%, +4.52%, +4.22%, +7.02% on 1%, 5%, 10%, 100% PACS, and +5.08%, +6.49%, +1.79%, +0.53% on 1%, 5%, 10%, 100% DomainNet, respectively. Extensive experiments are conducted on the commonly used multi-domain UDG benchmarks, including PACS (Li et al., 2017) and DomainNet (Peng et al., 2019).
Researcher Affiliation | Collaboration | Nanjing University; Peking University; The University of Sydney; Zhejiang University; SenseTime Research; Qing Yuan Research Institute; Shanghai AI Laboratory
Pseudocode | No | The paper describes its processes in text and uses figures to illustrate the architecture, but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement about releasing code for the work described, nor does it provide a direct link to a source-code repository.
Open Datasets | Yes | PACS, proposed by (Li et al., 2017), is a widely used benchmark for domain generalization. It consists of four domains, including Photo (1,670 images), Art Painting (2,048 images), Cartoon (2,344 images), and Sketch (3,929 images), and each domain contains seven categories. (Peng et al., 2019) proposes a large and diverse cross-domain benchmark, DomainNet, which contains 586,575 examples with 345 object classes across six domains: Real, Painting, Sketch, Clipart, Infograph, and Quickdraw.
Dataset Splits | Yes | Second, we use a different number of labeled training examples from the validation subset of the source domains to finetune the classifier or the whole backbone. In detail, when the fraction of labeled finetuning data is lower than 10% of the whole validation subset, we only finetune the linear classifier for all the methods; when it is larger than 10%, we finetune the whole network, including the backbone and the classifier.
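The split-dependent finetuning protocol quoted above can be sketched as a small decision rule. This is an illustrative sketch, not code from the paper; the function name and the handling of the exactly-10% boundary (which the quoted text leaves ambiguous between "lower than" and "larger than") are assumptions.

```python
def finetune_mode(labeled_fraction: float) -> str:
    """Which parameters to finetune, following the protocol quoted above.

    labeled_fraction: fraction of the source-domain validation subset
    that is labeled (e.g. 0.01, 0.05, 0.10, 1.00).

    Assumption: the exactly-10% case is treated as classifier-only,
    since the paper reports linear-probe results at that fraction;
    the quoted text does not state this explicitly.
    """
    if labeled_fraction <= 0.10:
        return "classifier-only"   # freeze backbone, train linear classifier
    return "full-network"          # finetune backbone and classifier jointly
```

For example, the 1% and 5% PACS settings would use `"classifier-only"`, while the 100% setting would finetune the whole network.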
Hardware Specification | No | The paper does not mention specific hardware models (GPU, CPU, or specific cloud instances) used for running its experiments. It only refers to model architectures like "ViT-small" or "ViT-tiny" as the backbone network.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions such as Python 3.8 or PyTorch 1.9).
Experiment Setup | Yes | The learning rate for pre-training is 1.5 × 10^-5 and decays with a cosine decay schedule. The weight decay is set to 0.05 and the batch size to 256 × N_d, where N_d is the number of domains in the training set. All methods are pre-trained for 1000 epochs, consistent with the implementations in (Zhang et al., 2022) for fair comparisons. The feature dimension is set to 1024. For finetuning, we follow the exact training schedule of (Zhang et al., 2022). We use an MAE (He et al., 2021) model unsupervised pre-trained on ImageNet for 1600 epochs to ensure labels are not available during the whole pre-training process. We set α = 2 and β = 2.
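The pre-training hyperparameters quoted above (base learning rate 1.5 × 10^-5, cosine decay over 1000 epochs, batch size 256 × N_d) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and no warmup phase is assumed since the quoted text does not mention one.

```python
import math

BASE_LR = 1.5e-5       # pre-training learning rate from the quoted setup
WEIGHT_DECAY = 0.05
EPOCHS = 1000

def cosine_lr(epoch: int, base_lr: float = BASE_LR, total: int = EPOCHS) -> float:
    """Cosine-decayed learning rate at a given epoch (no warmup assumed).

    Starts at base_lr at epoch 0 and decays smoothly to 0 at `total`.
    """
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total))

def effective_batch_size(n_domains: int, per_domain: int = 256) -> int:
    """Batch size 256 * N_d, where N_d is the number of training domains."""
    return per_domain * n_domains
```

For PACS with three source domains, for instance, this rule gives an effective batch size of 768, and the learning rate falls to half its base value at the schedule midpoint (epoch 500).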