Fairness without Demographics through Knowledge Distillation

Authors: Junyi Chai, Taeuk Jang, Xiaoqian Wang

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on three datasets show that our method outperforms state-of-the-art alternatives, with notable improvements in group fairness and with relatively small decrease in accuracy. |
| Researcher Affiliation | Academia | Junyi Chai, Taeuk Jang, Xiaoqian Wang; Elmore Family School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47906; {chai28,jang141,joywang}@purdue.edu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | New Adult: The Adult reconstruction dataset (Ding et al., 2021) contains 49,531 samples with 14 attributes. COMPAS: The COMPAS dataset (Larson et al., 2016) contains 7,215 samples with 11 attributes. Following previous works on fairness (Zafar et al., 2017), we only select black and white defendants in the COMPAS dataset, and the modified dataset contains 6,150 samples. The goal is to predict whether a defendant reoffends within two years, and we choose sex and race as sensitive attributes. CelebA: The CelebA dataset (Liu et al., 2015) contains 202,599 face images, each of resolution 178 x 218, with 40 binary attributes. |
| Dataset Splits | Yes | To avoid large discrepancies in testing data, before each repetition, we randomly split data into 50% training data, 10% validation data and 40% test data. |
| Hardware Specification | Yes | We implement our method in PyTorch 1.10.1 with one NVIDIA RTX-3090 GPU. |
| Software Dependencies | Yes | We implement our method in PyTorch 1.10.1 with one NVIDIA RTX-3090 GPU. |
| Experiment Setup | Yes | We build the teacher model using ResNet-152 (He et al., 2016) and student model using ResNet-18 (He et al., 2016). For student model trained on softmax label, the temperature is tuned to find the best validation accuracy. The hyperparameters of comparing methods are tuned with binary search to find global minimum, as suggested in the original paper (Hashimoto et al., 2018). |
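
The split protocol and the temperature-scaled soft labels quoted above can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_indices` and `softmax_with_temperature` are hypothetical, the per-repetition seed is an assumption, and the softmax form shown is the standard knowledge-distillation one, which the summary does not spell out. It uses only the Python standard library.

```python
import math
import random


def split_indices(n_samples, seed, train_frac=0.5, val_frac=0.1):
    """Randomly partition indices into train/validation/test parts.

    The remaining fraction (40% with the defaults) becomes the test set.
    `seed` is an assumed way to make each repetition reproducible.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def softmax_with_temperature(logits, temperature=1.0):
    """Turn teacher logits into soft labels; a larger temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# One repetition on a dataset the size of the filtered COMPAS set (6,150 samples).
train_idx, val_idx, test_idx = split_indices(6150, seed=0)

# Soft labels for a made-up pair of teacher logits at two temperatures.
sharp = softmax_with_temperature([2.0, 0.5], temperature=1.0)
soft = softmax_with_temperature([2.0, 0.5], temperature=4.0)
```

Drawing a fresh split with a new seed before each repetition matches the quoted protocol. In the quoted setup the temperature is chosen by validation accuracy, so a sketch like this would sit inside a loop over candidate temperatures.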