Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization
Authors: Haiyang Yang, Xiaotong Li, Shixiang Tang, Feng Zhu, Yizhou Wang, Meilin Chen, Lei Bai, Rui Zhao, Wanli Ouyang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Qualitative results on extensive datasets illustrate our method improves the state-of-the-art unsupervised domain generalization methods by average +5.59%, +4.52%, +4.22%, +7.02% on 1%, 5%, 10%, 100% PACS, and +5.08%, +6.49%, +1.79%, +0.53% on 1%, 5%, 10%, 100% DomainNet, respectively. Massive experiments are conducted on the commonly used multi-domain UDG benchmarks, including PACS (Li et al., 2017) and DomainNet (Peng et al., 2019). |
| Researcher Affiliation | Collaboration | ¹Nanjing University, ²Peking University, ³The University of Sydney, ⁴Zhejiang University, ⁵SenseTime Research, ⁶Qing Yuan Research Institute, ⁷Shanghai AI Laboratory |
| Pseudocode | No | The paper describes its processes in text and uses figures to illustrate the architecture, but it does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | PACS, proposed by (Li et al., 2017), is a widely used benchmark for domain generalization. It consists of four domains, including Photo (1,670 images), Art Painting (2,048 images), Cartoon (2,344 images), and Sketch (3,929 images), and each domain contains seven categories. (Peng et al., 2019) proposes a large and diverse cross-domain benchmark, DomainNet, which contains 586,575 examples with 345 object classes across six domains: Real, Painting, Sketch, Clipart, Infograph, and Quickdraw. |
| Dataset Splits | Yes | Second, we use a different number of labeled training examples of the validation subset in the source domains to finetune the classifier or the whole backbone. In detail, when the fraction of the labeled finetuning data is lower than 10% of the whole validation subset in the source domains, we only finetune the linear classifier for all the methods. When the fraction of labeled finetuning data is larger than 10% of the whole validation subset in the source domains, we finetune the whole network, including the backbone and the classifier. |
| Hardware Specification | No | The paper does not mention specific hardware models (GPU, CPU, or specific cloud instances) used for running its experiments. It only refers to model architectures like 'ViT-small' or 'ViT-tiny' as the backbone network. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | The learning rate for pre-training is 1.5 × 10⁻⁵ and then decays with a cosine decay schedule. The weight decay is set to 0.05 and the batch size is set to 256 × N_d, where N_d is the number of domains in the training set. All methods are pre-trained for 1000 epochs, which is consistent with the implementations in (Zhang et al., 2022) for fair comparisons. The feature dimension is set to 1024. For finetuning, we follow the exact training schedule as that in (Zhang et al., 2022). We use a MAE (He et al., 2021) model unsupervisedly pre-trained on ImageNet for 1600 epochs to ensure labels are not available during the whole pretraining process. We set α = 2 and β = 2. |
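The pre-training hyperparameters quoted in the Experiment Setup row can be sketched as a small, self-contained schedule helper. This is a minimal illustration under stated assumptions (no warmup phase, cosine decay from the base rate to zero over the 1000 epochs); the function and constant names are ours, not from the paper's code.

```python
import math

# Hyperparameters as quoted in the paper's setup description.
BASE_LR = 1.5e-5       # pre-training learning rate
WEIGHT_DECAY = 0.05    # AdamW-style weight decay
TOTAL_EPOCHS = 1000    # pre-training length

def cosine_lr(epoch: int, base_lr: float = BASE_LR, total: int = TOTAL_EPOCHS) -> float:
    """Cosine-decayed learning rate at a given epoch (assumes no warmup)."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total))

def batch_size(num_domains: int) -> int:
    """Effective batch size: 256 per training domain (256 * N_d)."""
    return 256 * num_domains

# Illustration: with 3 source domains (a common leave-one-domain-out setup),
# the effective batch size is 768; the LR starts at 1.5e-5 and decays to ~0.
print(batch_size(3))
print(cosine_lr(0))
print(cosine_lr(TOTAL_EPOCHS))
```

The 256 × N_d batch size keeps the per-domain sample count fixed at 256 regardless of how many source domains the benchmark uses.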
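The finetuning protocol quoted in the Dataset Splits row (linear-probe below 10% labeled data, full finetuning above) can be expressed as a tiny decision helper. This is a hedged sketch: the quoted text does not state which regime applies at exactly 10%, so this version assumes 10% falls into the full-finetune case; the function name is illustrative.

```python
def finetune_scope(label_fraction: float) -> str:
    """Which parameters to finetune for a given fraction of labeled data.

    Per the quoted protocol: below 10% of the labeled validation subset,
    only the linear classifier is tuned; otherwise the whole network
    (backbone and classifier) is tuned. Behaviour at exactly 10% is an
    assumption, as the quote only covers 'lower than' and 'larger than'.
    """
    if not 0.0 < label_fraction <= 1.0:
        raise ValueError("label_fraction must be in (0, 1]")
    return "linear_classifier" if label_fraction < 0.10 else "backbone_and_classifier"

# The label fractions reported in the paper's benchmarks:
for frac in (0.01, 0.05, 0.10, 1.00):
    print(frac, finetune_scope(frac))
```

Separating the two regimes this way mirrors the standard low-label evaluation practice: linear probing measures representation quality when labels are scarce, while full finetuning measures end-task performance when labels are plentiful.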