Optimal Transport of Classifiers to Fairness

Authors: Maarten Buyl, Tijl De Bie

NeurIPS 2022

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information.
Researcher Affiliation | Academia | Maarten Buyl, Ghent University (maarten.buyl@ugent.be); Tijl De Bie, Ghent University (tijl.debie@ugent.be)
Pseudocode | No | The paper contains mathematical formulations and propositions but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and a pipeline to run it are included in the supplemental material.
Open Datasets | Yes | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information. First, the UCI Adult Census Income dataset with two binary sensitive attributes SEX and RACE. Second, the UCI Bank Marketing dataset, where we measure fairness with respect to the original continuous AGE values and the binary, quantized version AGE_BINNED. Third, the Dutch Census dataset with sensitive attribute SEX. Fourth, the Diabetes dataset with sensitive attribute GENDER.
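The Bank Marketing row mentions both the continuous AGE attribute and its binary, quantized version AGE_BINNED. A minimal sketch of such a quantization, assuming a hypothetical median cutoff (the paper's actual binning rule is not quoted here):

```python
import numpy as np

# Illustrative ages; in the paper this would be the Bank Marketing AGE column.
age = np.array([22, 35, 41, 58, 63, 29])

# Hypothetical quantization rule: the paper does not specify the cutoff,
# so the median is used here purely for illustration.
threshold = np.median(age)
age_binned = (age >= threshold).astype(int)
```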
Dataset Splits | No | Each configuration of each method (i.e. each α value and fairness notion) was tested for 10 train-test splits with proportions 80/20.
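The quoted evaluation protocol (10 repeated train-test splits with proportions 80/20) can be sketched with scikit-learn; the data and seeds below are illustrative, not the paper's.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in data; the paper's datasets (e.g. UCI Adult) would be loaded here.
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)

# 10 repeated 80/20 train-test splits, as described in the quoted protocol.
test_sizes = []
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # ... fit the classifier on (X_tr, y_tr), evaluate on (X_te, y_te) ...
    test_sizes.append(len(X_te))
```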
Hardware Specification | Yes | All experiments were conducted using half the hyperthreads on an internal machine equipped with a 12 Core Intel(R) Xeon(R) Gold processor and 256 GB of RAM.
Software Dependencies | No | The paper mentions using a 'logistic regression classifier' and 'cross-entropy loss' but does not specify any software names with version numbers, such as programming languages, libraries, or frameworks (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | All experiments were performed using a logistic regression classifier. To achieve fairness, we jointly optimize fairness regularizers with the cross-entropy loss as in Eq. (13), and compute the gradient of their sum with respect to the parameters of the classifier. The OTF cost is the adjusted version from Def. 5, with ϵ = 10⁻³ (different choices for ϵ are illustrated in Appendix B.2). For the cost function c, we use the Euclidean distance between non-protected features. Additional information on datasets and hyperparameters is given in Appendix C.2 and C.3 respectively.
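The quoted setup, jointly optimizing the cross-entropy loss of a logistic regression classifier with a fairness regularizer, can be sketched as follows. This is a minimal sketch, not the paper's method: the demographic-parity surrogate below is a hypothetical stand-in for the OTF cost of Def. 5, and the data, α value, learning rate, and iteration count are all illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in data with a binary sensitive attribute a.
rng = np.random.default_rng(0)
n, d = 200, 4
X = rng.normal(size=(n, d))
a = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * a + rng.normal(scale=0.5, size=n) > 0).astype(float)

w = np.zeros(d)
alpha, lr = 1.0, 0.1  # alpha: fairness-regularizer strength (hypothetical value)

for _ in range(500):
    p = sigmoid(X @ w)
    # Gradient of the cross-entropy loss for logistic regression.
    g_ce = X.T @ (p - y) / n
    # Demographic-parity surrogate (E[p | a=0] - E[p | a=1])^2,
    # used here as a stand-in for the paper's OTF regularizer.
    gap = p[a == 0].mean() - p[a == 1].mean()
    dp = p * (1 - p)                      # d sigmoid / d logit
    g0 = (dp[a == 0, None] * X[a == 0]).mean(axis=0)
    g1 = (dp[a == 1, None] * X[a == 1]).mean(axis=0)
    g_fair = 2 * gap * (g0 - g1)
    # Gradient step on the sum of both objectives, as in the quoted setup.
    w -= lr * (g_ce + alpha * g_fair)

p = sigmoid(X @ w)
final_gap = abs(p[a == 0].mean() - p[a == 1].mean())
```

The key design point mirrored from the quote is that fairness enters as a differentiable regularizer whose gradient is summed with that of the cross-entropy loss, rather than as a post-processing step.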