Characterizing Fairness Over the Set of Good Models Under Selective Labels
Authors: Amanda Coston, Ashesh Rambachan, Alexandra Chouldechova
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate these use cases on a recidivism prediction task and a real-world credit-scoring task. ... Contributions: We ... (4) use FaiRS to audit the COMPAS risk assessment, finding that it generates larger predictive disparities between black and white defendants than any model in the set of good models [6]; and (5) use FaiRS on a selectively labelled credit-scoring dataset to build a model with lower predictive disparities than a benchmark model [7]. |
| Researcher Affiliation | Academia | 1Heinz College and Machine Learning Department, Carnegie Mellon University 2Department of Economics, Harvard University 3Heinz College, Carnegie Mellon University. |
| Pseudocode | Yes | Algorithm 1: Reject inference by extrapolation (RIE) for the selective labels setting ... Algorithm 2: Interpolation and extrapolation (IE) method for the selective labels setting |
| Open Source Code | No | The paper does not contain an explicit statement offering open-source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | We use FaiRS to empirically characterize the range of disparities over the set of good models in a recidivism risk prediction task applied to ProPublica's COMPAS data (Angwin et al., 2016). |
| Dataset Splits | Yes | We split the data 50%-50% into a train and test set. |
| Hardware Specification | No | The paper does not mention any specific hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | The paper mentions methods like "random forests" and "logistic regression" and refers to "the fairlearn package", but it does not specify exact version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | We analyze the range of predictive disparities ... among logistic regression models on a quadratic polynomial of the defendant's age and number of prior offenses whose training loss is near-comparable to COMPAS (loss tolerance ϵ = 1% of COMPAS training loss). ... As benchmark models, we use the loss-minimizing linear models learned using KGB, RIE, and IE approaches, whose respective training losses are used to select the corresponding loss tolerances ϵ. We use the class of linear models for the FaiRS algorithm for KGB, RIE, and IE approaches. |
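The experiment setup above hinges on the "set of good models": all models in a class whose training loss is within a tolerance ϵ of a benchmark loss. The following is a minimal sketch of that ϵ-tolerance filter; the synthetic data, the quadratic feature expansion of age and prior offenses, the regularization grid, and the plain gradient-descent fit are all illustrative assumptions here, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the two COMPAS features the paper uses
# (age and number of prior offenses), expanded to a quadratic polynomial.
n = 2000
age = rng.uniform(18, 70, n)
priors = rng.poisson(2, n).astype(float)
X = np.column_stack([age, priors, age**2, priors**2, age * priors])
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize features
X = np.column_stack([np.ones(n), X])           # add intercept column
true_w = np.array([-0.5, -0.8, 1.0, 0.2, 0.3, 0.1])
y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

def log_loss(w):
    """Mean logistic (cross-entropy) training loss of weights w."""
    p = 1 / (1 + np.exp(-X @ w))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit(l2):
    """Fit L2-regularized logistic regression by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(2000):
        p = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (p - y) / n + l2 * w
        w -= 0.5 * grad
    return w

# Candidate models: same class, different regularization strengths.
candidates = [fit(l2) for l2 in (0.0, 1e-3, 1e-2, 1e-1, 1.0)]
losses = [log_loss(w) for w in candidates]

# The "set of good models": everything within eps of the benchmark
# (best) training loss. The paper's recidivism experiment uses eps = 1%.
benchmark = min(losses)
eps = 0.01
good = [i for i, loss in enumerate(losses) if loss <= (1 + eps) * benchmark]
print("good-model indices:", good)
```

FaiRS then searches over this ϵ-level set for the models with the smallest and largest predictive disparities, rather than reporting the disparity of a single loss-minimizing model.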