Symmetric Self-Paced Learning for Domain Generalization

Authors: Di Zhao, Yun Sing Koh, Gillian Dobbie, Hongsheng Hu, Philippe Fournier-Viger

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments across five popular benchmark datasets demonstrate the effectiveness of the proposed learning strategy. Experiments conducted on five benchmark datasets, including Digits, PACS, Office-Home, VLCS, and NICO++, demonstrate the effectiveness of the proposed learning strategy. Ablation studies further validate the effectiveness and robustness of the proposed training scheduler and difficulty measure in domain generalization.
Researcher Affiliation | Academia | Di Zhao [1], Yun Sing Koh [1], Gillian Dobbie [1], Hongsheng Hu [2], Philippe Fournier-Viger [3]; [1] School of Computer Science, University of Auckland; [2] CSIRO's Data61; [3] College of Computer Science and Software Engineering, Shenzhen University
Pseudocode | Yes | The structure of SSPL is illustrated in Figure 2, and the algorithm is summarized in Algorithm 1.
Open Source Code | Yes | The code is available in https://github.com/RobustMM/VIGIL.
Open Datasets | Yes | The proposed approach is evaluated on five popular domain generalization benchmark datasets, which cover a variety of image classification problems. (1) Digits (Zhou et al. 2020) consists of four digit recognition tasks, namely MNIST (LeCun et al. 1998), MNIST-M (Ganin and Lempitsky 2015), SVHN (Netzer et al. 2011), and SYN (Ganin and Lempitsky 2015). (2) PACS (Li et al. 2017b) consists of four domains... (3) Office-Home (Venkateswara et al. 2017)... (4) VLCS (Fang, Xu, and Rockmore 2013)... (5) NICO++ (Zhang et al. 2023)...
Dataset Splits | No | The paper describes a 'leave-one-out-test evaluation strategy' in which one domain is held out for testing and the remaining domains are used as source domains for training. It does not explicitly mention a separate validation split, distinct from the test set, for hyperparameter tuning or model selection during training. (A minimal leave-one-domain-out sketch follows the table.)
Hardware Specification | Yes | All experiments are conducted on NVIDIA Tesla A100 GPUs.
Software Dependencies | No | The paper mentions 'Our methodology is implemented using the PyTorch libraries.' but does not specify the version of PyTorch or any other software dependency.
Experiment Setup | Yes | The optimizer utilized for training is Stochastic Gradient Descent (SGD), with a momentum of 0.9 and a weight decay of 5e-4. For the Digits dataset, we train the networks with an initial learning rate of 0.05 and a batch size of 64 for 50 epochs. The learning rate is decayed by a factor of 0.1 every 20 epochs. For the PACS, Office-Home, and VLCS datasets, the networks are trained with a learning rate of 0.01 and a batch size of 32 for 50 epochs. For the NICO++ dataset, the networks are trained with a learning rate of 0.005 and a batch size of 64 for 50 epochs. (A hedged PyTorch sketch of these settings follows the table.)
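
The Dataset Splits row refers to the paper's leave-one-out-test evaluation strategy. The sketch below illustrates that protocol in a generic form only: the domain list follows PACS, and `load_domain`, `train`, and `evaluate` are hypothetical placeholders, not functions from the authors' released code.

```python
# Minimal sketch of a leave-one-domain-out evaluation loop (an assumption:
# this mirrors the paper's "leave-one-out-test" protocol in spirit, not code).
# `load_domain`, `train`, and `evaluate` are hypothetical helpers.

DOMAINS = ["art_painting", "cartoon", "photo", "sketch"]  # e.g., the PACS domains

def leave_one_domain_out(domains, load_domain, train, evaluate):
    """Hold out each domain in turn; train only on the remaining source domains."""
    results = {}
    for target in domains:
        sources = [d for d in domains if d != target]
        source_data = [load_domain(d) for d in sources]
        model = train(source_data)                     # the target domain is never seen
        results[target] = evaluate(model, load_domain(target))
    return results                                     # per-target-domain score
```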
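
The Experiment Setup row lists concrete optimizer hyperparameters. Below is a hedged PyTorch sketch of those settings: only the quoted hyperparameters come from the paper, while the model and dataset objects are placeholders and applying the StepLR decay (stated only for Digits) to the other benchmarks is an assumption.

```python
# A hedged sketch of the reported training setup; not the authors' code.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def build_training(model: nn.Module, dataset, *, lr: float, batch_size: int):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.SGD(
        model.parameters(), lr=lr, momentum=0.9, weight_decay=5e-4
    )
    # Decay the learning rate by a factor of 0.1 every 20 epochs (stated for
    # Digits; assumed here for the other benchmarks as well).
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)
    return loader, optimizer, scheduler

# Per-dataset settings quoted in the Experiment Setup row (50 epochs each):
CONFIGS = {
    "Digits":      {"lr": 0.05,  "batch_size": 64},
    "PACS":        {"lr": 0.01,  "batch_size": 32},
    "Office-Home": {"lr": 0.01,  "batch_size": 32},
    "VLCS":        {"lr": 0.01,  "batch_size": 32},
    "NICO++":      {"lr": 0.005, "batch_size": 64},
}
```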