Fair regression via plug-in estimator and recalibration with statistical guarantees

Authors: Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we present numerical experiments illustrating that the proposed method is often superior or competitive with state-of-the-art methods.
Researcher Affiliation | Academia | Evgenii Chzhen (LMO, Université Paris-Saclay, CNRS, Inria); Christophe Denis (LAMA, Université Gustave Eiffel; MIA-Paris, AgroParisTech, INRAE, Université Paris-Saclay); Mohamed Hebiri (LAMA, Université Gustave Eiffel; CREST, ENSAE, IP Paris); Luca Oneto (DIBRIS, University of Genoa); Massimiliano Pontil (Istituto Italiano di Tecnologia; University College London)
Pseudocode | Yes | Algorithm 1: Smoothed accelerated gradient descent
Open Source Code | Yes | The source of our method can be found at https://github.com/lucaoneto/NIPS2020_Fairness.
Open Datasets | Yes | We consider five benchmark datasets, CRIME, LAW, NLSY, STUD, and UNIV, which are briefly described below: Communities&Crime (CRIME) contains socio-economic, law enforcement, and crime data about communities in the US [50]... Law School (LAW) refers to the Law School Admissions Council's National Longitudinal Bar Passage Study [56]... National Longitudinal Survey of Youth (NLSY) involves survey results by the U.S. Bureau of Labor Statistics... Student Performance (STUD) approaches 649 students' achievement (final grade) in secondary education of two Portuguese schools using 33 attributes [19]...
Dataset Splits | Yes | For all datasets we split the data in two parts (70% train and 30% test); this procedure is repeated 30 times, and we report the average performance on the test set alongside its standard deviation. We employ the two-step 10-fold CV procedure considered by [22] to select the best hyperparameters with the training set.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for running the experiments.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned.
Experiment Setup | Yes | As our theory suggests that L = N^{1/4} leads to a statistically grounded approach, we choose L ∈ {6, 12, 24}, since the size of the considered datasets is smaller than 24^4 ≈ 3×10^5, and β ∈ {0.1, 0.01}. For RLS we set the regularization hyperparameter λ ∈ 10^{−4.5, −3.5, …, 3}, and for KRLS we set λ ∈ 10^{−4.5, −3.5, …, 3} and γ ∈ 10^{−4.5, −3.5, …, 3}. Finally, for RF we set the number of trees to 1000, and for the number of features to select during tree creation we search in {d^{1/4}, d^{1/2}, d^{3/4}}.
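
The Pseudocode row above points to Algorithm 1, a smoothed accelerated gradient descent. As a rough illustration of that family of methods, and not the paper's actual objective or algorithm, the sketch below runs Nesterov-style acceleration on a Huber-smoothed absolute-loss regression problem; the names `huber_grad` and `smoothed_agd` and the synthetic data are all hypothetical stand-ins.

```python
import numpy as np

def huber_grad(r, beta):
    """Gradient of the Huber (Moreau) smoothing of |r| with parameter beta."""
    return np.clip(r / beta, -1.0, 1.0)

def smoothed_agd(X, y, beta=0.1, n_iter=500):
    """Nesterov-style accelerated gradient descent on a beta-smoothed
    absolute-loss objective (1/n) * sum_i |x_i^T w - y_i|.

    Illustrative only: the paper's Algorithm 1 operates on its own
    smoothed fair-regression objective; a generic smoothed L1
    regression problem stands in for it here.
    """
    n, d = X.shape
    # Lipschitz constant of the smoothed gradient: ||X||_2^2 / (n * beta)
    L = np.linalg.norm(X, 2) ** 2 / (n * beta)
    w = np.zeros(d)      # current iterate
    z = w.copy()         # extrapolation point
    t = 1.0              # momentum scalar
    for _ in range(n_iter):
        grad = X.T @ huber_grad(X @ z - y, beta) / n
        w_next = z - grad / L
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)
        w, t = w_next, t_next
    return w

# Tiny usage example on synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
w_hat = smoothed_agd(X, y, beta=0.1)
```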
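
The Dataset Splits row describes 30 random 70/30 splits with CV-based hyperparameter selection on the training part. A minimal sketch of that protocol follows, assuming scikit-learn and substituting a single 10-fold grid search for the two-step 10-fold CV of [22]; `repeated_evaluation` and the use of `KernelRidge` as a KRLS stand-in are illustrative choices, not the paper's code.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

def repeated_evaluation(X, y, n_repeats=30, test_size=0.30):
    """Sketch of the quoted protocol: 30 random 70/30 splits,
    hyperparameters tuned by 10-fold CV on the training part,
    mean +/- std of the test error reported."""
    scores = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        # Exponent grid assumed to run -4.5, -3.5, ..., 3.5 in unit steps
        exponents = np.arange(-4.5, 3.6, 1.0)
        grid = GridSearchCV(
            KernelRidge(kernel="rbf"),
            {"alpha": 10.0 ** exponents, "gamma": 10.0 ** exponents},
            cv=10, scoring="neg_mean_squared_error")
        grid.fit(X_tr, y_tr)
        scores.append(mean_squared_error(y_te, grid.predict(X_te)))
    return np.mean(scores), np.std(scores)
```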
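
The Experiment Setup row lists the search grids in compact notation. The snippet below assembles them as Python objects, assuming unit steps in the exponent grids (the quoted endpoint "3" is ambiguous, so 3.5 is used as an assumed upper end) and a hypothetical dataset size N and dimension d.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

N, d = 10_000, 20            # hypothetical dataset size and dimension

# Theory suggests L = N^{1/4}; the paper searches L in {6, 12, 24}
# because its datasets have fewer than 24^4 (about 3e5) points.
L_theory = int(N ** 0.25)    # = 10 for N = 10_000
L_grid = [6, 12, 24]
beta_grid = [0.1, 0.01]      # smoothing parameter

# Exponent grid assumed: -4.5, -3.5, ..., 3.5 in unit steps
exponents = np.arange(-4.5, 3.6, 1.0)
lambda_grid = 10.0 ** exponents   # RLS / KRLS regularization
gamma_grid = 10.0 ** exponents    # KRLS kernel width

# RF: 1000 trees; features per split searched over {d^(1/4), d^(1/2), d^(3/4)}
max_features_grid = [max(1, int(d ** p)) for p in (0.25, 0.5, 0.75)]
rf_models = [RandomForestRegressor(n_estimators=1000, max_features=m)
             for m in max_features_grid]
```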