FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data

Authors: Zhun Deng, Jiayao Zhang, Linjun Zhang, Ting Ye, Yates Coley, Weijie J. Su, James Zou

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the power of FIFA by combining it with a popular fair classification algorithm, and the resulting algorithm achieves significantly better fairness generalization on several real-world datasets.
Researcher Affiliation | Academia | Zhun Deng, Columbia University (zd2322@columbia.edu); Jiayao Zhang, University of Pennsylvania (zjiayao@upenn.edu); Linjun Zhang, Rutgers University (lz412@stat.rutgers.edu); Ting Ye, University of Washington (tingye1@uw.edu); Yates Coley, Kaiser Permanente Washington Health Research Institute & University of Washington (Rebecca.Y.Coley@kp.org); Weijie J. Su, University of Pennsylvania (suw@wharton.upenn.edu); James Zou, Stanford University (jamesz@stanford.edu)
Pseudocode | Yes | Algorithm 1: FIFA Combined Grid Search
Open Source Code | Yes | Our code is available to the public on GitHub at https://github.com/zjiayao/fifa-iclr23.
Open Datasets | Yes | (i) CelebA (Liu et al., 2015): the task is to predict whether the person in the image has blond hair or not, where the sensitive attribute is the gender of the person. (ii) Adult Income (Dua & Graff, 2017): the task is to predict whether the income is above 50K per year, where the sensitive attribute is the gender. We also use the new Adult Income dataset (from California in 2021) introduced by Ding et al. (2021), where the sensitive attribute is the race. (iii) Dutch Census (Centraal Bureau voor de Statistiek, Statistics Netherlands): the task is to predict whether an individual has a prestigious occupation, and the sensitive attribute is the gender.
Dataset Splits | No | We use the official train-test split for the CelebA dataset. For Adult Income and Dutch Census, we use the train_test_split procedure of the scikit-learn package with a training-test set ratio of 0.8 and a random seed of 1 to generate the training and test sets.
Hardware Specification | Yes | We perform all experiments on NVIDIA RTX 2080 Ti GPUs.
Software Dependencies | No | We use the train_test_split procedure of the scikit-learn package... We use the Adam optimizer...
Experiment Setup | Yes | We use the Adam optimizer with a learning rate of 1e-4 and weight decay of 5e-5 to train these models with stochastic batches of size 128. We performed pilot experiments and learnt that under this configuration the models usually converge within the first 1500 iterations in terms of training loss, and thus we fix the training time at 8000 iterations, which corresponds to roughly four epochs.
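
For context on the pseudocode row above (Algorithm 1, FIFA combined with grid search): the snippet below is only an illustrative sketch of a class- and group-dependent logit-margin adjustment of the kind FIFA's logit-based loss builds on (LDAM-style margins), not a reproduction of the paper's Algorithm 1. The n^(-1/4) scaling, the constant C, and all names are assumptions made for illustration; in the paper such an imbalance-aware loss is combined with an existing fair-classification (grid-search) algorithm.

```python
import torch
import torch.nn.functional as F

class MarginAdjustedLoss(torch.nn.Module):
    """Illustrative (class, group)-dependent margin loss; NOT the paper's Algorithm 1."""

    def __init__(self, counts_per_class_and_group, C=0.5):
        super().__init__()
        # counts_per_class_and_group: dict {(label, group): number of training examples}.
        # Rarer (class, group) combinations receive larger margins (assumed n^(-1/4) scaling).
        self.margins = {key: C * n ** (-0.25)
                        for key, n in counts_per_class_and_group.items()}

    def forward(self, logits, labels, groups):
        # Subtract the margin from the logit of the true class, so that
        # under-represented (class, group) pairs must be classified with
        # a larger separation from the decision boundary.
        adjusted = logits.clone()
        for i, (y, a) in enumerate(zip(labels.tolist(), groups.tolist())):
            adjusted[i, y] -= self.margins[(y, a)]
        return F.cross_entropy(adjusted, labels)
```

The intent of such margins is to demand more separation for under-represented (class, group) combinations, which is the kind of imbalance-aware adjustment the paper combines with fair-classification algorithms to improve fairness generalization.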
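
The datasets row above lists CelebA, the original and the new (ACS-based) Adult Income, and the Dutch Census data. As a hedged illustration, the snippet below shows one common way to obtain two of these public datasets; the torchvision and folktables calls, the survey year, and the state code are assumptions and are not taken from the paper.

```python
from torchvision import datasets, transforms
from folktables import ACSDataSource, ACSIncome

# CelebA with its official train/test split; the attribute targets include "Blond_Hair".
celeba_train = datasets.CelebA(root="data", split="train", target_type="attr",
                               transform=transforms.ToTensor(), download=True)
celeba_test = datasets.CelebA(root="data", split="test", target_type="attr",
                              transform=transforms.ToTensor(), download=True)

# New Adult Income (ACS) via folktables (Ding et al., 2021); year and state are assumptions.
data_source = ACSDataSource(survey_year="2021", horizon="1-Year", survey="person")
ca_data = data_source.get_data(states=["CA"], download=True)
features, labels, groups = ACSIncome.df_to_numpy(ca_data)
```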
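
The dataset-splits row quotes an 80/20 split produced with scikit-learn and a fixed seed of 1. A minimal sketch of that call (the file name is a placeholder, since the paper does not give one) would be:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder path; the split parameters match the quoted description.
df = pd.read_csv("adult.csv")
train_df, test_df = train_test_split(df, train_size=0.8, random_state=1)
```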
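
The experiment-setup row specifies Adam with learning rate 1e-4, weight decay 5e-5, batch size 128, and a fixed budget of 8000 iterations. A minimal PyTorch sketch using those hyperparameters (the model and data here are placeholders, not the architectures used in the paper) might look like:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and synthetic data; only the optimizer, batch size, and
# iteration budget come from the quoted experiment setup.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2))
dataset = TensorDataset(torch.randn(10_000, 64), torch.randint(0, 2, (10_000,)))
loader = DataLoader(dataset, batch_size=128, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-5)
criterion = nn.CrossEntropyLoss()

# Fixed budget of 8000 stochastic-batch iterations, as in the quoted description.
step = 0
while step < 8000:
    for x, y in loader:
        if step >= 8000:
            break
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        step += 1
```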