Fair regression with Wasserstein barycenters
Authors: Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments indicate that our method is very effective in learning fair models, with a relative increase in error rate that is inferior to the relative gain in fairness. [5 Empirical study] In this section, we present numerical experiments with the proposed fair regression estimator defined in Section 3. In all experiments, we collect statistics on the test set $\mathcal{T} = \{(X_i, S_i, Y_i)\}_{i=1}^{n_{\text{test}}}$. The empirical mean squared error (MSE) is defined as $\mathrm{MSE}(g) = \frac{1}{n_{\text{test}}} \sum_{(X,S,Y) \in \mathcal{T}} (Y - g(X,S))^2$. We also measure the violation of the fairness constraint imposed by Definition 2.2 via the empirical Kolmogorov-Smirnov (KS) statistic, $\mathrm{KS}(g) = \max_{s, s' \in \mathcal{S}} \sup_{t \in \mathbb{R}} \big\lvert \frac{1}{\lvert\mathcal{T}_s\rvert} \sum_{(X,S,Y) \in \mathcal{T}_s} \mathbb{1}\{g(X,S) \le t\} - \frac{1}{\lvert\mathcal{T}_{s'}\rvert} \sum_{(X,S,Y) \in \mathcal{T}_{s'}} \mathbb{1}\{g(X,S) \le t\} \big\rvert$, where for all $s \in \mathcal{S}$ we define the set $\mathcal{T}_s = \{(X, S, Y) \in \mathcal{T} : S = s\}$. For all datasets we split the data in two parts (70% train and 30% test); this procedure is repeated 30 times, and we report the average performance on the test set alongside its standard deviation. We employ the 2-step 10-fold CV procedure considered by [17] to select the best hyperparameters with the training set. *(A sketch of these metrics appears after the table.)* |
| Researcher Affiliation | Academia | Evgenii Chzhen (LMO, Université Paris-Saclay, CNRS, Inria); Christophe Denis (LAMA, Université Gustave Eiffel; MIA-Paris, AgroParisTech, INRAE, Université Paris-Saclay); Mohamed Hebiri (LAMA, Université Gustave Eiffel; CREST, ENSAE, IP Paris); Luca Oneto (DIBRIS, University of Genoa); Massimiliano Pontil (Istituto Italiano di Tecnologia; University College London) |
| Pseudocode | Yes | A pseudo-code implementation of $\hat{g}$ in Eq. (6) is reported in Algorithm 1. |
| Open Source Code | Yes | The source of our method can be found at https://github.com/lucaoneto/NIPS2020_Fairness. |
| Open Datasets | Yes | Communities&Crime (CRIME) contains socio-economic, law enforcement, and crime data about communities in the US [37]... Law School (LAW) refers to the Law School Admissions Council's National Longitudinal Bar Passage Study [44]... National Longitudinal Survey of Youth (NLSY) involves survey results by the U.S. Bureau of Labor Statistics that are intended to gather information on the labor market activities and other life events of several groups [8]... Student Performance (STUD) approaches 649 students' achievement (final grade) in secondary education of two Portuguese schools using 33 attributes [14]... |
| Dataset Splits | Yes | For all datasets we split the data in two parts (70% train and 30% test); this procedure is repeated 30 times, and we report the average performance on the test set alongside its standard deviation. We employ the 2-step 10-fold CV procedure considered by [17] to select the best hyperparameters with the training set. |
| Hardware Specification | No | The paper does not mention any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper refers to types of base estimators like "RLS", "KRLS", and "RF", but it does not specify any software libraries, frameworks, or tools with their version numbers that were used for implementation or experimentation. |
| Experiment Setup | Yes | The hyperparameters of the methods are set as follows. For RLS we set the regularization hyperparameter λ ∈ 10^{-4.5, -3.5, ..., 3}, and for KRLS we set λ ∈ 10^{-4.5, -3.5, ..., 3} and γ ∈ 10^{-4.5, -3.5, ..., 3}. Finally, for RF we set the number of trees to 1000, and for the number of features to select during tree creation we search in {d^{1/4}, d^{1/2}, d^{3/4}}. *(See the experiment-setup sketch after the table.)* |
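
The Research Type row quotes the two evaluation measures used in the paper: the empirical MSE and the empirical Kolmogorov-Smirnov (KS) violation of the fairness constraint. The following is a minimal sketch of how these quantities could be computed; it is not the authors' code, and the function names (`empirical_mse`, `empirical_ks`) and the NumPy-based implementation are illustrative assumptions.

```python
import numpy as np

def empirical_mse(y_true, y_pred):
    """Empirical mean squared error over the test set T."""
    return np.mean((y_true - y_pred) ** 2)

def empirical_ks(y_pred, s):
    """Empirical Kolmogorov-Smirnov fairness violation: the largest gap, over
    all pairs of sensitive groups and all thresholds t, between the empirical
    CDFs of the predictions g(X, S) within each group."""
    groups = np.unique(s)
    # The supremum over t is attained at a jump point of one of the group-wise
    # empirical CDFs, i.e. at one of the pooled predicted values.
    thresholds = np.sort(y_pred)
    ks = 0.0
    for i, s1 in enumerate(groups):
        for s2 in groups[i + 1:]:
            cdf1 = np.searchsorted(np.sort(y_pred[s == s1]),
                                   thresholds, side="right") / np.sum(s == s1)
            cdf2 = np.searchsorted(np.sort(y_pred[s == s2]),
                                   thresholds, side="right") / np.sum(s == s2)
            ks = max(ks, float(np.max(np.abs(cdf1 - cdf2))))
    return ks

# Toy usage: predictions for group 1 are shifted upwards, so KS is large.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=1000)
y_true = rng.normal(size=1000)
y_pred = rng.normal(loc=0.5 * s, size=1000)
print(empirical_mse(y_true, y_pred), empirical_ks(y_pred, s))
```

Evaluating both group-wise empirical CDFs on the pooled sorted predictions suffices because the absolute difference of two step functions attains its supremum at one of their jump points.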
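The Dataset Splits and Experiment Setup rows describe 30 repetitions of a 70/30 train/test split, with hyperparameters selected by cross-validation on the training part over the quoted grids for RLS, KRLS, and RF. The paper does not name a software stack, so the sketch below is a hedged reconstruction using scikit-learn (Ridge for RLS, KernelRidge for KRLS, RandomForestRegressor for RF) and a plain 10-fold grid search in place of the 2-step 10-fold CV procedure of [17]; the function name `run_repetitions`, the exact grid spacing, and the feature count `d` are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Log-spaced grid spanning 10^{-4.5} to 10^{3} for the regularization (lambda)
# and kernel-width (gamma) hyperparameters; the exact spacing is illustrative.
log_grid = np.logspace(-4.5, 3.0, num=16)
d = 20  # number of features; dataset dependent

models = {
    "RLS":  (Ridge(), {"alpha": log_grid}),
    "KRLS": (KernelRidge(kernel="rbf"), {"alpha": log_grid, "gamma": log_grid}),
    "RF":   (RandomForestRegressor(n_estimators=1000),
             {"max_features": [int(d ** p) for p in (0.25, 0.5, 0.75)]}),
}

def run_repetitions(X, s, y, n_repeats=30, seed=0):
    """Random 70/30 train/test splits repeated n_repeats times; hyperparameters
    are chosen on the training part by 10-fold cross-validated grid search
    (a stand-in for the 2-step 10-fold CV procedure referenced in the paper)."""
    rng = np.random.RandomState(seed)
    results = []
    for _ in range(n_repeats):
        X_tr, X_te, s_tr, s_te, y_tr, y_te = train_test_split(
            X, s, y, test_size=0.3, random_state=rng)
        for name, (estimator, grid) in models.items():
            search = GridSearchCV(estimator, grid, cv=10,
                                  scoring="neg_mean_squared_error")
            # Predictions depend on (X, S), matching the paper's g(X, S).
            search.fit(np.column_stack([X_tr, s_tr]), y_tr)
            y_pred = search.predict(np.column_stack([X_te, s_te]))
            mse = np.mean((y_te - y_pred) ** 2)
            # empirical_ks(y_pred, s_te) from the previous sketch would be
            # reported alongside the MSE.
            results.append((name, mse))
    return results
```

Averages and standard deviations over the 30 repetitions would then be reported per model, as in the quoted protocol.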