Constructing a Fair Classifier with Generated Fair Data

Authors: Taeuk Jang, Feng Zheng, Xiaoqian Wang (pp. 7908-7916)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical evidence to demonstrate the benefit of our model with respect to both fairness and accuracy. We empirically validate our model on benchmark fairness datasets and validate that the transfer learning method improves fairness under different fairness metrics while maintaining comparable predictive performance with state-of-the-art methods. In this section, we examine how the generated balanced data and transfer learning affect the fairness and accuracy of the classifier by comparing with state-of-the-art fairness methods.
Researcher Affiliation | Academia | Taeuk Jang (1), Feng Zheng (2), Xiaoqian Wang (1). (1) School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA. (2) Department of Computer Science and Technology, Southern University of Science and Technology, Shenzhen 518055, China.
Pseudocode | Yes | Algorithm 1: Optimization Procedure of Our Method
Open Source Code | No | The paper provides links to datasets used in the experiments (e.g., https://github.com/propublica/compas-analysis, https://meps.ahrq.gov/mepsweb/) and cites the UCI repository for others, but it does not explicitly state that the code for their methodology is open-source or provide a link to it.
Open Datasets | Yes | We implement the experiments on the four fairness datasets: Adult: data from the UCI repository (Kohavi 1996): [...] Compas: The dataset includes [...] https://github.com/propublica/compas-analysis German: credit data from the UCI repository (Dua and Graff 2019): [...] MEPS: The Medical Expenditure Panel Survey (MEPS) dataset [...] https://meps.ahrq.gov/mepsweb/
Dataset Splits | Yes | Each dataset is randomly split with the ratio of training, validation, and test sets being 70%, 15%, and 15%.
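The 70/15/15 random split described above can be sketched as follows; the seed and the use of NumPy are assumptions for illustration, not details from the paper.

```python
import numpy as np

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle indices and split into 70% train / 15% val / 15% test,
    mirroring the random split described in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

tr, va, te = split_indices(1000)
# 700 / 150 / 150 disjoint indices covering all 1000 samples
```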
Hardware Specification | Yes | We conducted the experiments on a Quadro RTX 6000 GPU and Intel i9-9960X on the PyTorch and TensorFlow frameworks.
Software Dependencies | No | The paper mentions using the 'PyTorch and TensorFlow framework' but does not specify their version numbers.
Experiment Setup | Yes | In our model, we design the encoder E with three dense layers with layer normalization (Ba, Kiros, and Hinton 2016) and ReLU followed by two residual layers. Decoder F has a symmetric structure to the encoder. Discriminator D consists of three dense layers with leaky ReLU activation. The classifier C consists of three dense layers with ReLU and dropout with the probability of 0.7. We first train the classifier with synthetic data only until validation accuracy converges. After that, we fine-tune the model with the real data and corresponding generated pairs with flipped values in sensitive attribute and target label.
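The quoted architecture can be sketched in PyTorch as below. The layer counts, activations, and dropout probability follow the paper's description; the hidden width, latent dimension, and leaky-ReLU slope are not stated in the paper and are assumptions here.

```python
import torch
import torch.nn as nn

HID = 64  # hidden width: an assumption, not given in the paper

class Residual(nn.Module):
    """A simple residual layer: x + ReLU(Wx + b)."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    def forward(self, x):
        return x + torch.relu(self.fc(x))

def make_encoder(d_in, d_z):
    # Three dense layers with layer normalization and ReLU,
    # followed by two residual layers (decoder F would mirror this).
    return nn.Sequential(
        nn.Linear(d_in, HID), nn.LayerNorm(HID), nn.ReLU(),
        nn.Linear(HID, HID), nn.LayerNorm(HID), nn.ReLU(),
        nn.Linear(HID, d_z), nn.LayerNorm(d_z), nn.ReLU(),
        Residual(d_z), Residual(d_z),
    )

def make_discriminator(d_z):
    # Three dense layers with leaky ReLU activation
    # (slope 0.2 is an assumed default).
    return nn.Sequential(
        nn.Linear(d_z, HID), nn.LeakyReLU(0.2),
        nn.Linear(HID, HID), nn.LeakyReLU(0.2),
        nn.Linear(HID, 1),
    )

def make_classifier(d_z, n_cls):
    # Three dense layers with ReLU and dropout with probability 0.7.
    return nn.Sequential(
        nn.Linear(d_z, HID), nn.ReLU(), nn.Dropout(0.7),
        nn.Linear(HID, HID), nn.ReLU(), nn.Dropout(0.7),
        nn.Linear(HID, n_cls),
    )
```

Per the quoted setup, the classifier would first be trained on synthetic (generated, balanced) data until validation accuracy converges, then fine-tuned on real samples paired with their generated counterparts that have the sensitive attribute and target label flipped.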