Optimal Transport of Classifiers to Fairness

Authors: Maarten Buyl, Tijl De Bie

NeurIPS 2022

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information.
Researcher Affiliation | Academia | Maarten Buyl, Ghent University (maarten.buyl@ugent.be); Tijl De Bie, Ghent University (tijl.debie@ugent.be)
Pseudocode | No | The paper contains mathematical formulations and propositions but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and a pipeline to run it are included in the supplemental material.
Open Datasets | Yes | Our experiments were performed on four datasets from fair binary classification literature [14, 28] with different kinds of sensitive information. First, the UCI Adult Census Income dataset with two binary sensitive attributes SEX and RACE. Second, the UCI Bank Marketing dataset, where we measure fairness with respect to the original continuous AGE values and the binary, quantized version AGE_BINNED. Third, the Dutch Census dataset with sensitive attribute SEX. Fourth, the Diabetes dataset with sensitive attribute GENDER.
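The Bank Marketing row mentions both the continuous AGE attribute and its binary, quantized version AGE_BINNED. A minimal sketch of such a quantization, assuming a hypothetical median cutoff (the paper's actual binning rule is not quoted here):

```python
import numpy as np

# Illustrative ages; in the paper this would be the Bank Marketing AGE column.
age = np.array([22, 35, 41, 58, 63, 29])

# Hypothetical quantization rule: the paper does not specify the cutoff,
# so the median is used here purely for illustration.
threshold = np.median(age)
age_binned = (age >= threshold).astype(int)
```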
Dataset Splits | No | Each configuration of each method (i.e. each α value and fairness notion) was tested for 10 train-test splits with proportions 80/20.
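The quoted evaluation protocol (10 repeated train-test splits with proportions 80/20) can be sketched with scikit-learn; the data and seeds below are illustrative, not the paper's.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in data; the paper's datasets (e.g. UCI Adult) would be loaded here.
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)

# 10 repeated 80/20 train-test splits, as described in the quoted protocol.
test_sizes = []
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # ... fit the classifier on (X_tr, y_tr), evaluate on (X_te, y_te) ...
    test_sizes.append(len(X_te))
```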
Hardware Specification | Yes | All experiments were conducted using half the hyperthreads on an internal machine equipped with a 12 Core Intel(R) Xeon(R) Gold processor and 256 GB of RAM.
Software Dependencies | No | The paper mentions using a 'logistic regression classifier' and 'cross-entropy loss' but does not specify any software names with version numbers, such as programming languages, libraries, or frameworks (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | All experiments were performed using a logistic regression classifier. To achieve fairness, we jointly optimize fairness regularizers with the cross-entropy loss as in Eq. (13), and compute the gradient of their sum with respect to the parameters of the classifier. The OTF cost is the adjusted version from Def. 5, with ϵ = 10⁻³ (different choices for ϵ are illustrated in Appendix B.2). For the cost function c, we use the Euclidean distance between non-protected features. Additional information on datasets and hyperparameters is given in Appendix C.2 and C.3 respectively.
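The quoted setup, jointly optimizing the cross-entropy loss of a logistic regression classifier with a fairness regularizer, can be sketched as follows. This is a minimal sketch, not the paper's method: the demographic-parity surrogate below is a hypothetical stand-in for the OTF cost of Def. 5, and the data, α value, learning rate, and iteration count are all illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in data with a binary sensitive attribute a.
rng = np.random.default_rng(0)
n, d = 200, 4
X = rng.normal(size=(n, d))
a = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * a + rng.normal(scale=0.5, size=n) > 0).astype(float)

w = np.zeros(d)
alpha, lr = 1.0, 0.1  # alpha: fairness-regularizer strength (hypothetical value)

for _ in range(500):
    p = sigmoid(X @ w)
    # Gradient of the cross-entropy loss for logistic regression.
    g_ce = X.T @ (p - y) / n
    # Demographic-parity surrogate (E[p | a=0] - E[p | a=1])^2,
    # used here as a stand-in for the paper's OTF regularizer.
    gap = p[a == 0].mean() - p[a == 1].mean()
    dp = p * (1 - p)                      # d sigmoid / d logit
    g0 = (dp[a == 0, None] * X[a == 0]).mean(axis=0)
    g1 = (dp[a == 1, None] * X[a == 1]).mean(axis=0)
    g_fair = 2 * gap * (g0 - g1)
    # Gradient step on the sum of both objectives, as in the quoted setup.
    w -= lr * (g_ce + alpha * g_fair)

p = sigmoid(X @ w)
final_gap = abs(p[a == 0].mean() - p[a == 1].mean())
```

The key design point mirrored from the quote is that fairness enters as a differentiable regularizer whose gradient is summed with that of the cross-entropy loss, rather than as a post-processing step.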