Fair regression via plug-in estimator and recalibration with statistical guarantees
Authors: Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present numerical experiments illustrating that the proposed method is often superior or competitive with state-of-the-art methods. |
| Researcher Affiliation | Academia | Evgenii Chzhen (LMO, Université Paris-Saclay, CNRS, Inria); Christophe Denis (LAMA, Université Gustave Eiffel; MIA-Paris, AgroParisTech, INRAE, Université Paris-Saclay); Mohamed Hebiri (LAMA, Université Gustave Eiffel; CREST, ENSAE, IP Paris); Luca Oneto (DIBRIS, University of Genoa); Massimiliano Pontil (Istituto Italiano di Tecnologia; University College London) |
| Pseudocode | Yes | Algorithm 1 Smoothed accelerated gradient descent |
| Open Source Code | Yes | The source of our method can be found at https://github.com/lucaoneto/NIPS2020_Fairness. |
| Open Datasets | Yes | We consider five benchmark datasets, CRIME, LAW, NLSY, STUD, and UNIV, which are briefly described below: Communities&Crime (CRIME) contains socio-economic, law enforcement, and crime data about communities in the US [50]... Law School (LAW) refers to the Law School Admissions Council's National Longitudinal Bar Passage Study [56]... National Longitudinal Survey of Youth (NLSY) involves survey results by the U.S. Bureau of Labor Statistics... Student Performance (STUD) approaches 649 students' achievement (final grade) in secondary education at two Portuguese schools using 33 attributes [19]... |
| Dataset Splits | Yes | For all datasets we split the data in two parts (70% train and 30% test); this procedure is repeated 30 times, and we report the average performance on the test set alongside its standard deviation. We employ the two-step 10-fold CV procedure considered by [22] to select the best hyperparameters with the training set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for running experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned. |
| Experiment Setup | Yes | As our theory suggests that L = N^{1/4} leads to a statistically grounded approach, we choose L ∈ {6, 12, 24}, since the size of the considered datasets is smaller than 24^4 ≈ 3·10^5, and β ∈ {0.1, 0.01}. For RLS we set the regularization hyperparameter λ ∈ 10^{−4.5, −3.5, …, 3}, and for KRLS we set λ ∈ 10^{−4.5, −3.5, …, 3} and γ ∈ 10^{−4.5, −3.5, …, 3}. Finally, for RF we set the number of trees to 1000, and for the number of features to select during tree creation we search in {d^{1/4}, d^{1/2}, d^{3/4}}. |
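The evaluation protocol described above (repeated 70/30 splits with a grid search over the reported hyperparameter ranges) can be sketched as follows. This is a minimal illustration, not the authors' code: `split_70_30`, `evaluate`, and the dummy scorer are hypothetical names, and the exact λ/γ exponent grid is an assumption reconstructed from the reported endpoints.

```python
import random
import statistics

# Hyperparameter grids as reported in the experiment setup.
L_grid = [6, 12, 24]            # theory suggests L = N**0.25; datasets have N < 24**4
beta_grid = [0.1, 0.01]
# Assumed step of 1 between the reported exponent endpoints -4.5 and 3.
lam_grid = [10.0 ** e for e in (-4.5, -3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5)]

def split_70_30(n, seed):
    """One random 70% train / 30% test index split, as used in the experiments."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(0.7 * n)
    return idx[:cut], idx[cut:]

def evaluate(train_and_score, n, n_repeats=30):
    """Repeat the split n_repeats times; report mean and std of the test score."""
    scores = []
    for seed in range(n_repeats):
        train_idx, test_idx = split_70_30(n, seed)
        # `train_and_score` stands in for hyperparameter selection on the
        # training split (10-fold CV over the grids above) followed by
        # evaluation on the held-out test indices.
        scores.append(train_and_score(train_idx, test_idx))
    return statistics.mean(scores), statistics.stdev(scores)

# Hypothetical usage with a dummy scorer (test-set fraction of the data):
mean_score, std_score = evaluate(lambda tr, te: len(te) / 1000, n=1000)
```

The per-split standard deviation reported alongside the mean is exactly `statistics.stdev` over the 30 test scores; the actual model fitting and 10-fold CV inside each split are omitted here.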