Domain Decorrelation with Potential Energy Ranking

Authors: Sen Pei, Jiaxi Sun, Richard Yi Da Xu, Shiming Xiang, Gaofeng Meng

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | PoER reports superior performance on domain generalization benchmarks, improving the average top-1 accuracy by at least 1.20% compared to the existing methods. Moreover, we use PoER in the ECCV 2022 NICO Challenge, achieving top place with only a vanilla ResNet-18 and winning the jury award. and Experiments, Dataset: We consider four benchmarks to evaluate the performance of our proposed PoER, namely PACS (Li et al. 2017), VLCS (Ghifary et al. 2015a), Digits-DG (Zhou et al. 2021b), and Office-Home (Venkateswara et al. 2017). and Ablation Study
Researcher Affiliation | Academia | Sen Pei (1,2), Jiaxi Sun (1,2), Richard Yi Da Xu (4), Shiming Xiang (1,2), and Gaofeng Meng (1,2,3)*; 1: NLPR, Institute of Automation, Chinese Academy of Sciences; 2: School of Artificial Intelligence, University of Chinese Academy of Sciences; 3: CAIR, HK Institute of Science and Innovation, Chinese Academy of Sciences; 4: Hong Kong Baptist University
Pseudocode | Yes | Algorithm 1: Potential energy ranking for DG task.
Open Source Code | Yes | The code has been made publicly available at: https://github.com/ForeverPs/PoER
Open Datasets | Yes | We consider four benchmarks to evaluate the performance of our proposed PoER, namely PACS (Li et al. 2017), VLCS (Ghifary et al. 2015a), Digits-DG (Zhou et al. 2021b), and Office-Home (Venkateswara et al. 2017). The datasets can be downloaded via Dassl (Zhou et al. 2021a), a testing bed that includes many DG methods.
Dataset Splits | Yes | Office-Home contains images belonging to 65 categories within 4 domains, which are artistic, clip art, product, and the real world. Following DDAIG (Zhou et al. 2020a), we randomly split the source domains into 90% for training and 10% for validation, reporting the metrics on the leave-one-out domain using the best-validated model. and VLCS... We randomly split the source domains into 70% for training and 30% for validation following (Ghifary et al. 2015a), reporting metrics on the target domain using the best-validated classifier. and NICO... we randomly split the data into 90% for training and 10% for validation, reporting metrics on the left domains with the best-validated model. and Digits-DG... Images are split into 90% for training and 10% for validation. (A small split sketch is given after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for its experiments. It only mentions the model architectures used, such as ResNet-18 and Swin-Transformer (Tiny), without detailing the computing hardware.
Software Dependencies | No | The paper mentions software components such as the AdamW optimizer and model architectures like ResNet-18, DenseNet, and Swin-Transformer. However, it does not specify exact version numbers for any programming languages, libraries, or frameworks (e.g., PyTorch, TensorFlow), which are necessary for full reproducibility.
Experiment Setup | Yes | The learning rate starts from 1e-4 and halves every 70 epochs. The batch size is set to 128. The hyper-parameter α in Eq.(10) is set to 0.1 during the first 70 epochs and 0.2 afterwards. Only RandomHorizontalFlip and ColorJitter are adopted as the data augmentation schemes. The AdamW optimizer is used for training. Mean-std normalization is applied based on the ImageNet statistics. GCPL (Yang et al. 2018) uses the same settings as stated above, and all other methods employ the default official settings. We store the models after the first 10 epochs based on the top-1 accuracy on the validation set. The number of prototypes n is set to 3. (See the configuration sketch after the table.)
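The dataset-splits row above describes random 90%/10% source-domain splits (70%/30% for VLCS) with evaluation on the held-out domain. The snippet below is a minimal sketch of such a split, assuming a torchvision ImageFolder layout; the path, seed, and per-dataset wiring are illustrative and not taken from the authors' code.

```python
# Hedged sketch: random source-domain split (90/10, or 70/30 for VLCS).
import torch
from torch.utils.data import random_split
from torchvision.datasets import ImageFolder

source_domain = ImageFolder("data/office_home/art")  # placeholder path
train_ratio = 0.9                                     # 0.7 for VLCS
n_train = int(train_ratio * len(source_domain))

train_set, val_set = random_split(
    source_domain,
    [n_train, len(source_domain) - n_train],
    generator=torch.Generator().manual_seed(0),       # seed is an assumption
)
```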
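The experiment-setup row quotes concrete hyper-parameters. Below is a minimal sketch of that configuration, assuming PyTorch/torchvision and a vanilla ResNet-18 backbone; only the quoted values (learning rate and schedule, batch size, α schedule, number of prototypes, augmentation types, ImageNet normalization) come from the table, while the jitter strengths, class count, and total epoch budget are assumptions. This is not the authors' implementation of PoER itself.

```python
# Hedged sketch of the reported training configuration (not the official code).
import torch
import torchvision.transforms as T
from torchvision.models import resnet18

train_transform = T.Compose([
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),            # jitter strengths are assumed
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406],       # ImageNet statistics
                std=[0.229, 0.224, 0.225]),
])

model = resnet18(num_classes=7)                   # e.g. 7 classes for PACS (assumed)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Learning rate starts at 1e-4 and halves every 70 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=70, gamma=0.5)

batch_size = 128
num_prototypes = 3                                # n = 3

def alpha(epoch: int) -> float:
    """Alpha in Eq.(10): 0.1 during the first 70 epochs, 0.2 afterwards."""
    return 0.1 if epoch < 70 else 0.2
```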