Constructing a Fair Classifier with Generated Fair Data

Authors: Taeuk Jang, Feng Zheng, Xiaoqian Wang (pp. 7908-7916)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical evidence to demonstrate the benefit of our model with respect to both fairness and accuracy. We empirically validate our model on benchmark fairness datasets and validate that the transfer learning method improves fairness under different fairness metrics while maintaining comparable predictive performance with state-of-the-art methods. In this section, we examine how the generated balanced data and transfer learning affect the fairness and accuracy of the classifier by comparing with state-of-the-art fairness methods.
Researcher Affiliation | Academia | Taeuk Jang (1), Feng Zheng (2), Xiaoqian Wang (1). (1) School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA. (2) Department of Computer Science and Technology, Southern University of Science and Technology, Shenzhen 518055, China.
Pseudocode | Yes | Algorithm 1: Optimization Procedure of Our Method
Open Source Code | No | The paper provides links to datasets used in the experiments (e.g., https://github.com/propublica/compas-analysis, https://meps.ahrq.gov/mepsweb/) and cites the UCI repository for others, but it does not explicitly state that the code for their methodology is open-source or provide a link to it.
Open Datasets | Yes | We implement the experiments on the four fairness datasets: Adult: data from the UCI repository (Kohavi 1996): [...] Compas: The dataset includes [...] https://github.com/propublica/compas-analysis German: credit data from the UCI repository (Dua and Graff 2019): [...] MEPS: The Medical Expenditure Panel Survey (MEPS) dataset [...] https://meps.ahrq.gov/mepsweb/
Dataset Splits | Yes | Each dataset is randomly split with the ratio of training, validation, and test sets being 70%, 15%, and 15%.
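The 70/15/15 random split described above can be sketched as follows; the seed and the use of NumPy are assumptions for illustration, not details from the paper.

```python
import numpy as np

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle indices and split into 70% train / 15% val / 15% test,
    mirroring the random split described in the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

tr, va, te = split_indices(1000)
# 700 / 150 / 150 disjoint indices covering all 1000 samples
```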
Hardware Specification | Yes | We conducted the experiments on a Quadro RTX 6000 GPU and Intel i9-9960X on the PyTorch and TensorFlow frameworks.
Software Dependencies | No | The paper mentions using the 'PyTorch and TensorFlow framework' but does not specify their version numbers.
Experiment Setup | Yes | In our model, we design the encoder E with three dense layers with layer normalization (Ba, Kiros, and Hinton 2016) and ReLU followed by two residual layers. Decoder F has a symmetric structure to the encoder. Discriminator D consists of three dense layers with leaky ReLU activation. The classifier C consists of three dense layers with ReLU and dropout with the probability of 0.7. We first train the classifier with synthetic data only until validation accuracy converges. After that, we fine-tune the model with the real data and corresponding generated pairs with flipped values in sensitive attribute and target label.
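The quoted architecture can be sketched in PyTorch as below. The layer counts, activations, and dropout probability follow the paper's description; the hidden width, latent dimension, and leaky-ReLU slope are not stated in the paper and are assumptions here.

```python
import torch
import torch.nn as nn

HID = 64  # hidden width: an assumption, not given in the paper

class Residual(nn.Module):
    """A simple residual layer: x + ReLU(Wx + b)."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)

    def forward(self, x):
        return x + torch.relu(self.fc(x))

def make_encoder(d_in, d_z):
    # Three dense layers with layer normalization and ReLU,
    # followed by two residual layers (decoder F would mirror this).
    return nn.Sequential(
        nn.Linear(d_in, HID), nn.LayerNorm(HID), nn.ReLU(),
        nn.Linear(HID, HID), nn.LayerNorm(HID), nn.ReLU(),
        nn.Linear(HID, d_z), nn.LayerNorm(d_z), nn.ReLU(),
        Residual(d_z), Residual(d_z),
    )

def make_discriminator(d_z):
    # Three dense layers with leaky ReLU activation
    # (slope 0.2 is an assumed default).
    return nn.Sequential(
        nn.Linear(d_z, HID), nn.LeakyReLU(0.2),
        nn.Linear(HID, HID), nn.LeakyReLU(0.2),
        nn.Linear(HID, 1),
    )

def make_classifier(d_z, n_cls):
    # Three dense layers with ReLU and dropout with probability 0.7.
    return nn.Sequential(
        nn.Linear(d_z, HID), nn.ReLU(), nn.Dropout(0.7),
        nn.Linear(HID, HID), nn.ReLU(), nn.Dropout(0.7),
        nn.Linear(HID, n_cls),
    )
```

Per the quoted setup, the classifier would first be trained on synthetic (generated, balanced) data until validation accuracy converges, then fine-tuned on real samples paired with their generated counterparts that have the sensitive attribute and target label flipped.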