Blind Justice: Fairness with Encrypted Sensitive Attributes

Authors: Niki Kilbertus, Adrià Gascón, Matt Kusner, Michael Veale, Krishna Gummadi, Adrian Weller

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we show how to overcome both doubts and that fair training, certification and verification are feasible for realistic datasets. 5.1. Experimental Setup and Datasets. We work with two separate code bases. Our Python code... The full MPC protocol is implemented in C++... We consider 5 real world datasets, namely the adult (Adult), German credit (German), and bank market (Bank) datasets from the UCI machine learning repository (Lichman, 2013), the stop, question and frisk 2012 dataset (SQF), and the COMPAS dataset (Angwin et al., 2016) (COMPAS). For practical purposes (see Section 4), we subsample 2^i examples from each dataset with the largest possible i, see Table 1. Moreover, we also run on synthetic data...
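The power-of-two subsampling quoted above is easy to reproduce. A minimal sketch in Python, assuming NumPy arrays for features and labels; the function name and RNG seed are illustrative and not taken from the paper's released code:

```python
import numpy as np

def subsample_power_of_two(X, y, seed=0):
    """Keep 2**i examples with the largest i such that 2**i <= n."""
    n = len(X)
    i = int(np.floor(np.log2(n)))            # largest exponent with 2**i <= n
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=2 ** i, replace=False)
    return X[idx], y[idx]
```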
Researcher Affiliation | Academia | Niki Kilbertus (1,2), Adrià Gascón (3,4), Matt Kusner (3,4), Michael Veale (5), Krishna P. Gummadi (6), Adrian Weller (2,3); 1 Max Planck Institute for Intelligent Systems, 2 University of Cambridge, 3 The Alan Turing Institute, 4 University of Warwick, 5 University College London, 6 Max Planck Institute for Software Systems.
Pseudocode | Yes | Algorithm 1 in Section B in the appendix describes the computations M and REG have to run for fair model training using the Lagrangian multiplier technique and the p%-rule from eq. (9).
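For context, the general technique referenced here can be sketched in plain Python: logistic regression trained with a Lagrangian multiplier on a covariance proxy for the p%-rule (in the spirit of Zafar et al.). This is an illustration under those assumptions, not the paper's MPC protocol or Algorithm 1; all names and hyperparameters are assumptions:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fair_logreg_lagrangian(X, y, z, c=0.1, lr=0.1, epochs=10, batch=64, seed=0):
    """Logistic regression with a p%-rule-style covariance constraint,
    trained via a Lagrangian multiplier: gradient descent on theta,
    projected gradient ascent on lambda. y in {0,1}, z is the sensitive attribute."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    lam = 0.0
    zc = z - z.mean()                         # centred sensitive attribute
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            Xb, yb, zb = X[idx], y[idx], zc[idx]
            scores = Xb @ theta
            # gradient of the logistic loss
            grad = Xb.T @ (sigmoid(scores) - yb) / len(idx)
            # constraint g(theta) = |cov(z, theta^T x)| - c <= 0
            cov = zb @ scores / len(idx)
            grad += lam * np.sign(cov) * (Xb.T @ zb) / len(idx)
            theta -= lr * grad
            lam = max(0.0, lam + lr * (abs(cov) - c))   # ascent on the multiplier
    return theta, lam
```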
Open Source Code | Yes | Code is available at https://github.com/nikikilbertus/blind-justice
Open Datasets | Yes | We consider 5 real world datasets, namely the adult (Adult), German credit (German), and bank market (Bank) datasets from the UCI machine learning repository (Lichman, 2013), the stop, question and frisk 2012 dataset (SQF), and the COMPAS dataset (Angwin et al., 2016) (COMPAS).
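As an example of obtaining one of these datasets, a short sketch for the UCI Adult data with pandas; the URL, column names, and the choice of 'sex' as the sensitive attribute are assumptions for illustration, not details specified in the paper:

```python
import pandas as pd

# Column names assumed from the UCI Adult dataset documentation.
cols = ["age", "workclass", "fnlwgt", "education", "education-num",
        "marital-status", "occupation", "relationship", "race", "sex",
        "capital-gain", "capital-loss", "hours-per-week", "native-country",
        "income"]
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
adult = pd.read_csv(url, names=cols, skipinitialspace=True, na_values="?")

# Binary label and an illustrative sensitive attribute.
y = (adult["income"] == ">50K").astype(int)
z = (adult["sex"] == "Female").astype(int)
```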
Dataset Splits | No | The paper mentions subsampling data and evaluating 'test set accuracy' but does not specify the dataset splits (e.g., train/validation/test percentages, sample counts, or the splitting methodology).
Hardware Specification | No | The paper mentions that experiments were run 'on a laptop computer' in Section 5.3, but no specific hardware details such as CPU, GPU models, or memory specifications are provided.
Software Dependencies | No | The paper states that the MPC protocol is implemented in 'C++ on top of the Obliv-C garbled circuits framework (Zahur & Evans, 2015a) and the Absentminded Crypto Kit (lib)', but it does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | Table 1 lists 'training over 10 epochs with batch size 64'. Section 5.2 mentions running methods 'for a range of constraint values in [10^-4, 10^0]'.
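Putting the reported setup together, a sweep over the constraint bound on a log grid with 10 epochs and batch size 64 could look as follows; this reuses the illustrative fair_logreg_lagrangian trainer sketched above and assumes preprocessed X_train/y_train/z_train and X_test/y_test arrays, none of which come from the paper's code:

```python
import numpy as np

# Sweep the fairness-constraint bound c over [1e-4, 1e0] on a log grid,
# training for 10 epochs with batch size 64 as listed in Table 1.
for c in np.logspace(-4, 0, num=9):
    theta, lam = fair_logreg_lagrangian(X_train, y_train, z_train,
                                         c=c, epochs=10, batch=64)
    acc = ((X_test @ theta > 0).astype(int) == y_test).mean()
    print(f"c={c:.1e}  test accuracy={acc:.3f}")
```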