Towards Debiasing DNN Models from Spurious Feature Influence

Authors: Mengnan Du, Ruixiang Tang, Weijie Fu, Xia Hu. Pages 9521-9528.

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments. In this section, we conduct experiments to evaluate the effectiveness of the proposed DeFI framework. Benchmark Datasets. We use three tabular datasets and one synthetic text dataset. The statistics are given in Tab. 1.
Researcher Affiliation | Academia | Mengnan Du (1), Ruixiang Tang (2), Weijie Fu (3), Xia Hu (2); (1) Texas A&M University, (2) Rice University, (3) Hefei University of Technology; dumengnan@tamu.edu, {rt39,xia.hu}@rice.edu, fwj.edu@gmail.com
Pseudocode | No | The paper describes its framework and training procedures in narrative text and mathematical equations, but it does not include explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | Benchmark Datasets. We use three tabular datasets and one synthetic text dataset. The statistics are given in Tab. 1. The first one is Adult Income (Adult), which aims to predict whether a salary is greater than or less than 50K (Kohavi 1996). The second one is Medical Expenditure (MEPS). MEPS is a medical dataset aiming to predict whether a person would have high utilization (Bellamy et al. 2018). The third is COMPAS, which aims to predict a criminal defendant's likelihood of reoffending (Angwin et al. 2016). The fourth one is Equity Evaluation Corpus (EEC), which is used to predict the sentiment of texts (Kiritchenko and Mohammad 2018).
Dataset Splits | Yes | Table 1: Dataset Statistics (Adult / MEPS / COMPAS / EEC). Training instances: 31600 / 11080 / 3700 / 2940. Validation instances: 4520 / 1482 / 523 / 420. Test instances: 9102 / 3168 / 1055 / 840.
Hardware Specification | No | The paper describes the DNN architectures and training parameters but does not specify any particular hardware components, such as CPU or GPU models, used for running the experiments.
Software Dependencies | No | The paper mentions the use of the 'Adam optimizer' and 'word2vec word embedding' but does not provide specific version numbers for these or any other software libraries or frameworks used in the implementation.
Experiment Setup | Yes | Implementation Details. For the EEC dataset, we use the 300-dimensional word2vec word embedding (Mikolov et al. 2013) to initialize the embedding layer of the CNN model. The hyperparameter m for Integrated Gradients in Eq.(2) is fixed as 50 for all experiments. The influence weight α in Eq.(5) is set as 0.01, 0.06, 0.03, 0.001 for Adult, MEPS, COMPAS, EEC, respectively. To train the DNN models, we use the Adam optimizer, and the learning rate is searched from {5e-5, 1e-4, 5e-4, 1e-3, 5e-3}. Note that hyper-parameters (β1, β2) and other hyper-parameters are tuned based on the trade-off between accuracy and fairness metrics on the validation sets.
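The setup above fixes m = 50 as the number of interpolation steps for Integrated Gradients. The paper's Eq.(2) is not reproduced here, so the following is only a minimal NumPy sketch of the standard m-step Riemann approximation of Integrated Gradients, using a toy analytic gradient function in place of a real DNN (the toy model and `grad_fn` are assumptions for illustration):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, m=50):
    """Approximate Integrated Gradients with an m-step Riemann sum.

    grad_fn  : gradient of the model output w.r.t. its input (here a
               hand-written toy gradient, not the paper's DNN).
    x        : input to attribute.
    baseline : reference input (commonly all zeros).
    """
    total = np.zeros_like(x, dtype=float)
    for k in range(1, m + 1):
        alpha = k / m  # interpolation coefficient along the path
        total += grad_fn(baseline + alpha * (x - baseline))
    # Scale the averaged gradients by the input-baseline difference.
    return (x - baseline) * total / m

# Toy model f(x) = sum(x**2), so its gradient is 2x.
grad_fn = lambda z: 2.0 * z
x = np.array([1.0, 2.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_fn, x, baseline, m=50)
```

By the completeness axiom, the attributions should sum approximately to f(x) - f(baseline) = 5; the small residual here comes from the finite m = 50 Riemann approximation.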