Optimal Transport of Classifiers to Fairness
Authors: Maarten Buyl, Tijl De Bie
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information. |
| Researcher Affiliation | Academia | Maarten Buyl Ghent University maarten.buyl@ugent.be Tijl De Bie Ghent University tijl.debie@ugent.be |
| Pseudocode | No | The paper contains mathematical formulations and propositions but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and a pipeline to run it are included in the supplemental material. |
| Open Datasets | Yes | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information. First, the UCI Adult Census Income dataset with two binary sensitive attributes SEX and RACE. Second, the UCI Bank Marketing dataset, where we measure fairness with respect to the original continuous AGE values and the binary, quantized version AGE_BINNED. Third, the Dutch Census dataset with sensitive attribute SEX. Fourth, the Diabetes dataset with sensitive attribute GENDER. |
| Dataset Splits | Yes | Each configuration of each method (i.e. each α value and fairness notion) was tested for 10 train-test splits with proportions 80/20. |
| Hardware Specification | Yes | All experiments were conducted using half the hyperthreads on an internal machine equipped with a 12 Core Intel(R) Xeon(R) Gold processor and 256 GB of RAM. |
| Software Dependencies | No | The paper mentions using a 'logistic regression classifier' and 'cross-entropy loss' but does not specify any software names with version numbers, such as programming languages, libraries, or frameworks (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | All experiments were performed using a logistic regression classifier. To achieve fairness, we jointly optimize fairness regularizers with the cross-entropy loss as in Eq. (13), and compute the gradient of their sum with respect to the parameters of the classifier. The OTF cost is the adjusted version from Def. 5, with ϵ = 10⁻³ (different choices for ϵ are illustrated in Appendix B.2). For the cost function c, we use the Euclidean distance between non-protected features. Additional information on datasets and hyperparameters is given in Appendix C.2 and C.3 respectively. |
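The training procedure quoted in the last row (cross-entropy jointly optimized with a weighted fairness regularizer, taking the gradient of their sum) can be sketched as below. This is a minimal illustration only: `demo_fair_grad` is a hypothetical score-gap penalty standing in for the paper's OTF cost, the group split and all hyperparameters are invented for the example, and the names are not from the paper or its code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_loss_grad(w, X, y, alpha, fair_grad_fn):
    """Gradient of cross-entropy plus alpha times a fairness regularizer,
    as in a joint objective L = CE + alpha * R_fair."""
    p = sigmoid(X @ w)
    ce_grad = X.T @ (p - y) / len(y)  # mean cross-entropy gradient
    return ce_grad + alpha * fair_grad_fn(w, X)

def demo_fair_grad(w, X):
    """Hypothetical regularizer gradient: squared gap between the mean
    predicted scores of two groups (NOT the paper's OTF cost)."""
    groups = X[:, 0] > 0  # placeholder sensitive-group split
    p = sigmoid(X @ w)
    gap = p[groups].mean() - p[~groups].mean()
    dp = p * (1 - p)  # derivative of sigmoid
    g = (X[groups] * dp[groups, None]).mean(0) \
        - (X[~groups] * dp[~groups, None]).mean(0)
    return 2.0 * gap * g  # chain rule on gap**2

# Toy data: label is essentially the sign of the second feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 1] + 0.1 * rng.normal(size=200) > 0).astype(float)

# Gradient descent on the joint objective.
w = np.zeros(3)
for _ in range(500):
    w -= 0.5 * joint_loss_grad(w, X, y, alpha=0.1, fair_grad_fn=demo_fair_grad)

acc = ((sigmoid(X @ w) > 0.5) == y).mean()
```

With a small `alpha`, accuracy stays close to the unregularized classifier; raising `alpha` trades accuracy for a smaller group gap, which mirrors the α sweep described in the report.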