Towards Combating Frequency Simplicity-biased Learning for Domain Generalization
Authors: Xilin He, Jingyu Hu, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Muhammad Haris Khan, Linlin Shen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first conduct cross-domain image classification and instance retrieval experiments to study the proposed method's performance. We compare our method with traditional data augmentation based methods [13, 33, 39], image-level style augmentation based methods [25, 37], feature-level style augmentation methods [45, 22, 44], the two most related frequency augmentation-based methods [38, 43] and other representative methods [23, 1]. Then we evaluate the frequency shortcuts following Wang's metric [35]. We further provide insights into models' data-driven frequency characteristics with the proposed AAUA and AAD. We also carry out detailed ablation studies. |
| Researcher Affiliation | Academia | Computer Vision Institute, School of Computer Science & Software Engineering, Shenzhen University; Shenzhen Institute of Artificial Intelligence and Robotics for Society; Guangdong Provincial Key Laboratory of Intelligent Information Processing; University of Exeter; Mohamed bin Zayed University of Artificial Intelligence |
| Pseudocode | Yes | For clarity of narrative, the pseudo-code for generating augmentation samples and the samples generated by AAUA and AAD are shown in the supplementary material. |
| Open Source Code | No | Code would be made publicly available after peer review. |
| Open Datasets | Yes | We evaluate the performance of the proposed method on three cross-domain image classification benchmarks of PACS [19], Digits and CIFAR-10-C [10]. We conduct the cross-domain instance retrieval task on person re-identification (re-ID) datasets of Market1501 [47] and DukeMTMC [27] with OSNet [48]. |
| Dataset Splits | Yes | Table 6: Experimental results on PACS dataset, where the listed domain is adopted for training and the reported results are evaluated on the remaining three domains. PT, AP, CT, SC denote the four domains of photo, art painting, cartoon and sketch, respectively. |
| Hardware Specification | Yes | All the experiments are run on an NVIDIA A100 80GB GPU. |
| Software Dependencies | No | The paper mentions using specific network architectures like ResNet-18 and OSNet and optimizers like SGD and Adam, but does not provide specific version numbers for software libraries such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | In all the experiments, we set the iterative optimization steps T of AAUA as 5. For a fair comparison, the number of augmented samples used in each iteration is set as 3, following [43]. We adopt the same network architectures as in previous works [43, 44]. Please see the supplementary material for more implementation details. ... For Digits, we train a convolutional network with the same architecture as in [49] with an SGD optimizer for 60 epochs. The initial learning rate is 0.001, which decays by a ratio of 0.1 every 20 epochs. ... For PACS [19], we train an ImageNet-pretrained ResNet-18 [9] on the source domain with an Adam optimizer for 60 epochs. The initial learning rate is 0.0001, which follows a cosine annealing decay strategy. For CIFAR-10-C, we train a Wide Residual Network [42] with a width factor of 4 and 16 layers, following previous works [10]. The network is optimized with an SGD optimizer for 120 epochs, with an initial learning rate of 0.1 which linearly decays by 0.1 every 40 epochs. ... In the task of cross-domain instance retrieval, we adopt the OSNet [48] pretrained on ImageNet, following the experiment settings in [45, 44]. The network is optimized with an Adam optimizer for 60 epochs, with an initial learning rate of 0.0003 which linearly decays by 0.1 every 20 epochs. |
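
The optimizer and schedule details quoted in the Experiment Setup row map directly onto standard PyTorch components. The sketch below is a minimal, hypothetical reconstruction of those per-benchmark training configurations for reference; it is not the authors' released code, and the `resnet18` constructors for Digits and CIFAR-10-C are stand-ins, since the exact architecture of [49] and the WRN-16-4 are not reproduced here.

```python
import torch.optim as optim
from torchvision import models

# PACS: ImageNet-pretrained ResNet-18, Adam, lr 1e-4, cosine annealing, 60 epochs.
pacs_model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
pacs_opt = optim.Adam(pacs_model.parameters(), lr=1e-4)
pacs_sched = optim.lr_scheduler.CosineAnnealingLR(pacs_opt, T_max=60)

# Digits: SGD, lr 1e-3, x0.1 step decay every 20 epochs, 60 epochs.
# (resnet18 is a hypothetical stand-in for the convolutional network of [49].)
digits_model = models.resnet18(num_classes=10)
digits_opt = optim.SGD(digits_model.parameters(), lr=1e-3)
digits_sched = optim.lr_scheduler.StepLR(digits_opt, step_size=20, gamma=0.1)

# CIFAR-10-C: WRN-16-4, SGD, lr 0.1, x0.1 decay every 40 epochs, 120 epochs.
# (resnet18 again stands in for the 16-layer, width-factor-4 Wide ResNet.)
cifar_model = models.resnet18(num_classes=10)
cifar_opt = optim.SGD(cifar_model.parameters(), lr=0.1)
cifar_sched = optim.lr_scheduler.StepLR(cifar_opt, step_size=40, gamma=0.1)

# Re-ID: ImageNet-pretrained OSNet, Adam, lr 3e-4, x0.1 decay every 20 epochs,
# 60 epochs. OSNet is not in torchvision, so it is omitted from this sketch.

# One scheduler step is taken at the end of each training epoch, e.g. for PACS:
for epoch in range(60):
    # ... one epoch of training on the source domain (omitted) ...
    pacs_sched.step()
```

Under these assumptions, the only quantity specific to the paper's method is the AAUA iteration count T = 5 and the 3 augmented samples per iteration; everything else in the row is a conventional supervised training recipe.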