Does enforcing fairness mitigate biases caused by subpopulation shift?
Authors: Subha Maity, Debarghya Mukherjee, Mikhail Yurochkin, Yuekai Sun
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also illustrate the practical implications of our theoretical results in simulations and on real data. ... We verify the theoretical findings of the paper empirically. Our goal is to show that an algorithm trained with fairness constraints on the biased train data $\tilde{P}$ achieves superior performance on the true data generating distribution $P$ at test time in comparison to an algorithm trained without fairness considerations. ... Simulations. We first verify the implications of Corollary 4.4 using simulation studies. ... Recidivism prediction on COMPAS data. We verify that our theoretical findings continue to apply on real data. |
| Researcher Affiliation | Collaboration | Subha Maity* (Department of Statistics, University of Michigan, smaity@umich.edu); Debarghya Mukherjee* (Department of Statistics, University of Michigan, mdeb@umich.edu); Mikhail Yurochkin (IBM Research, MIT-IBM Watson AI Lab, mikhail.yurochkin@ibm.com); Yuekai Sun (Department of Statistics, University of Michigan, yuekai@umich.edu) |
| Pseudocode | No | The paper describes algorithms and methods but does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | We refer to Reductions algorithm trained with loose EO violation constraint as baseline and Reductions trained with tight EO violation constraint as fair classifier (please see Appendix 3 for additional details and supplementary material for the code). |
| Open Datasets | Yes | We train baseline and fair classifier on COMPAS dataset [3]. ... We present results for the same experimental setup on the Adult dataset [4] in Table 2 in Appendix C. |
| Dataset Splits | No | For the COMPAS data, the paper states: 'each time splitting the data into identically distributed 70-30 train-test split'. This mentions a train and test split but does not explicitly specify a validation split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper mentions software components like 'Reductions fair classification algorithm' and 'logistic regression' but does not specify any version numbers for these or other software dependencies. |
| Experiment Setup | Yes | In our experiments we use Reductions fair classification algorithm [1] with logistic regression as the base classifier. For the fairness constraint we consider Equalized Odds [14] (EO), one of the major and more nuanced fairness definitions. We refer to Reductions algorithm trained with loose EO violation constraint as baseline and Reductions trained with tight EO violation constraint as fair classifier (please see Appendix 3 for additional details and supplementary material for the code). ... Specifically, consider a binary classification problem with two protected groups, i.e. $Y \in \{0, 1\}$ and $A \in \{0, 1\}$. We set $P$ to have equal representation of protected groups conditioned on the label and biased data $\tilde{P}$ to have one of the protected groups underrepresented. Specifically, let $p_{ay} = P_{A=a, Y=y}$, i.e. the $(a, y)$-indexed element of $P_{A,Y}$; $p_{ay} = 0.25 \;\forall\, a, y$ for $P$, and $p_{1y} = p_{\text{minor}}$, $p_{0y} = p_{\text{major}} = 0.5 - p_{\text{minor}}$ for $\tilde{P}$. For both $P$ and $\tilde{P}$ we fix class marginals $p_{\cdot 0} = p_{\cdot 1} = 0.5$ and generate Gaussian features $X \mid A = a, Y = y \sim \mathcal{N}(\mu_{ay}, \Sigma_{ay})$ in 2 dimensions (see additional data generating details in Appendix C). ... There are two binary protected attributes, Gender (male and female) and Race (white and non-white), resulting in 4 protected groups $A \in \{0, 1, 2, 3\}$. The task is to predict if a defendant will reoffend, i.e. $Y \in \{0, 1\}$. We repeat the experiment 100 times, each time splitting the data into identically distributed 70-30 train-test splits. (Hedged code sketches of this setup follow the table.) |
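
The simulation design quoted in the Experiment Setup row (a balanced $P$ versus a $\tilde{P}$ with one protected group underrepresented, and Gaussian class-conditional features) can be reproduced in a few lines. The sketch below is ours, not the authors' code: the means $\mu_{ay}$, the shared covariance, and the sample sizes are illustrative assumptions; the paper's Appendix C specifies the actual data generating details.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, p_minor, means, cov):
    """Sample n points with class marginals p_.0 = p_.1 = 0.5 and
    P(A=1 | Y=y) = 2 * p_minor, so that p_1y = p_minor and
    p_0y = 0.5 - p_minor, matching the setup quoted above."""
    y = rng.integers(0, 2, size=n)                 # Y ~ Bernoulli(0.5)
    a = (rng.random(n) < 2 * p_minor).astype(int)  # minority group indicator
    x = np.stack([rng.multivariate_normal(means[ai, yi], cov)
                  for ai, yi in zip(a, y)])        # X | A=a, Y=y ~ N(mu_ay, cov)
    return x, y, a

# Illustrative group/class means mu_ay in 2d and a shared covariance
# (our assumptions; the paper's Appendix C gives the actual values).
means = {(0, 0): [-2.0, 0.0], (0, 1): [2.0, 0.0],
         (1, 0): [-2.0, 1.0], (1, 1): [2.0, 1.0]}
cov = np.eye(2)

# Biased train data P-tilde (minority underrepresented) vs. unbiased test
# data P (p_minor = 0.25 recovers p_ay = 0.25 for all a, y).
X_tr, y_tr, a_tr = sample(5000, p_minor=0.05, means=means, cov=cov)
X_te, y_te, a_te = sample(5000, p_minor=0.25, means=means, cov=cov)
```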
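The baseline and fair classifiers are both instances of the Reductions approach of Agarwal et al. [1] with an Equalized Odds constraint, differing only in how tightly the EO violation is constrained. The paper does not name a library, but fairlearn implements this reduction as `ExponentiatedGradient`; a minimal sketch, assuming fairlearn's API, with the loose/tight `difference_bound` values chosen by us for illustration:

```python
from fairlearn.reductions import ExponentiatedGradient, EqualizedOdds
from sklearn.linear_model import LogisticRegression

def make_clf(eo_bound):
    # Reductions classifier: logistic regression base learner trained
    # under an Equalized Odds constraint of the given tightness.
    return ExponentiatedGradient(
        LogisticRegression(),
        constraints=EqualizedOdds(difference_bound=eo_bound),
    )

baseline = make_clf(eo_bound=0.2)   # "loose" EO constraint (our choice)
fair = make_clf(eo_bound=0.01)      # "tight" EO constraint (our choice)

for name, clf in [("baseline", baseline), ("fair", fair)]:
    clf.fit(X_tr, y_tr, sensitive_features=a_tr)  # train on biased P-tilde
    acc = (clf.predict(X_te) == y_te).mean()      # evaluate on unbiased P
    print(f"{name}: test accuracy on P = {acc:.3f}")
```

For the COMPAS experiment the paper repeats this procedure 100 times over identically distributed 70-30 train-test splits; with scikit-learn one such split is, e.g., `train_test_split(X, y, a, test_size=0.3)`.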