Fairness and Accuracy under Domain Generalization

Authors: Thai-Hoang Pham, Xueru Zhang, Ping Zhang

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world data validate the proposed algorithm.
Researcher Affiliation | Academia | Thai-Hoang Pham, Xueru Zhang, Ping Zhang; The Ohio State University, Columbus, OH 43210, USA; {pham.375, zhang.12807, zhang.10631}@osu.edu
Pseudocode | Yes | Algorithm 1: Fairness and Accuracy Transfer by Density Matching (FATDM)
Open Source Code | Yes | Model implementation is available at https://github.com/pth1993/FATDM.
Open Datasets | Yes | The original chest X-ray images and the corresponding metadata can be downloaded from PhysioNet (https://physionet.org/content/mimic-cxr-jpg/2.0.0/; https://physionet.org/content/mimiciv/2.0/).
Dataset Splits | Yes | We follow the leave-one-out domain setting, in which 3 domains are used for training and the remaining domain serves as the unseen target domain used for evaluation. ... 10% of training data is used for validation. Each model is trained for 10 epochs and the results are from the epoch with the best performance on the validation set.
Hardware Specification | Yes | Models (FATDM and baselines) are implemented with PyTorch version 1.11 and trained on multiple compute nodes (each model instance is trained on a single node with 4 CPUs, 8 GB of memory, and a single GPU (P100 or V100)).
Software Dependencies | Yes | Models (FATDM and baselines) are implemented with PyTorch version 1.11.
Experiment Setup | Yes | ω (the hyperparameter controlling the accuracy-fairness trade-off) varies from 0 to 10 with step sizes 0.0002 for ω ∈ [0, 0.002], 0.002 for ω ∈ [0.002, 0.1], and 0.2 for ω ∈ [0.2, 10]; γ (the hyperparameter controlling the accuracy-invariance trade-off) is set to 0.1 after hyperparameter tuning. Models (FATDM and baselines) are implemented with PyTorch version 1.11 and trained on multiple compute nodes (each model instance is trained on a single node with 4 CPUs, 8 GB of memory, and a single GPU (P100 or V100)). One domain's data is used for testing and the other domains' data for training (10% of training data is used for validation). Each model is trained for 10 epochs and the results are from the epoch with the best performance on the validation set.
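The ω sweep above uses different step sizes in different sub-ranges. A minimal sketch of how such a non-uniform grid could be constructed is below; the segment endpoints are the values quoted in the setup, but the exact grid used in the paper's code is an assumption.

```python
import numpy as np

def omega_grid():
    """Build the non-uniform omega sweep: fine steps near 0, coarse steps up to 10.

    Segment boundaries follow the quoted setup; how the paper's code joins
    the segments is an assumption.
    """
    seg1 = np.arange(0.0, 0.002, 0.0002)     # step 0.0002 on [0, 0.002)
    seg2 = np.arange(0.002, 0.1, 0.002)      # step 0.002 on [0.002, 0.1)
    seg3 = np.arange(0.2, 10.0 + 1e-9, 0.2)  # step 0.2 on [0.2, 10]
    return np.unique(np.concatenate([seg1, seg2, seg3]))

grid = omega_grid()
```

The small `1e-9` slack on the last segment guards against floating-point drift in `np.arange`, so the endpoint 10 is included in the sweep.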
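The leave-one-out domain protocol described above can be sketched as follows: each domain takes a turn as the unseen target, the remaining domains form the training pool, and 10% of the pooled training data is held out for validation. Domain names, the dummy per-domain example count, and the seed are illustrative assumptions, not values from the paper.

```python
import random

def leave_one_domain_out(domains, n_per_domain=100, val_frac=0.1, seed=0):
    """Yield (target_domain, train_set, val_set) folds.

    Each fold holds out one domain entirely for evaluation and splits the
    pooled data from the remaining domains 90/10 into train/validation.
    """
    rng = random.Random(seed)
    for target in domains:
        train_pool = [d for d in domains if d != target]
        # Dummy (domain, index) pairs stand in for real examples.
        examples = [(d, i) for d in train_pool for i in range(n_per_domain)]
        rng.shuffle(examples)
        n_val = int(len(examples) * val_frac)
        val, train = examples[:n_val], examples[n_val:]
        yield target, train, val

for target, train, val in leave_one_domain_out(["A", "B", "C", "D"]):
    print(target, len(train), len(val))  # 3 domains pooled, 10% held out
```

With four domains, each fold trains on the other three and evaluates on the held-out one, matching the "3 domains for training, 1 unseen target" setting quoted in the report.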