Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Adjusting Machine Learning Decisions for Equal Opportunity and Counterfactual Fairness

Authors: Yixin Wang, Dhanya Sridhar, David Blei

TMLR 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate the algorithms, and the trade-off between accuracy and fairness, on datasets about admissions, income, credit, and recidivism." |
| Researcher Affiliation | Academia | Yixin Wang (University of Michigan); Dhanya Sridhar (Mila-Quebec AI Institute and Université de Montréal); David M. Blei (Columbia University) |
| Pseudocode | Yes | "Algorithm 1: The eco and cf decision makers (for additive-error models)." |
| Open Source Code | Yes | "The supplement provides software that reproduces the studies." |
| Open Datasets | Yes | "We study these approaches on simulated admissions data and on three public datasets, about income, credit, and recidivism." The adult income data (Dua & Graff, 2017a) and the German credit data (Dua & Graff, 2017b) contain data about people and decisions about who is loan-worthy. ProPublica's COMPAS data contains information about criminal defendants and decisions about their recidivism score. |
| Dataset Splits | No | The paper mentions a 'held-out test set' and 'training data' but does not provide the specific percentages, absolute counts, or citations to predefined splits needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide any details about the hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'linear regression models' and 'logistic regression' but does not specify the software libraries or package versions needed to replicate the experiments. |
| Experiment Setup | No | Some parameters for the simulated data generation are given (e.g., "We fix the effect of test score on admissions to s = 2.0. We generate multiple datasets by varying the gender bias a and the historical disadvantage on test score."), but the paper does not provide experimental setup details such as hyperparameter values (e.g., learning rate, batch size) or training configurations for the ML models used in the experiments. |
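To illustrate what the missing "Dataset Splits" and "Experiment Setup" details would look like if fully reported, here is a minimal, hypothetical sketch. It assumes scikit-learn and synthetic data; none of the numbers (split fraction, seed, regularization strength) come from the paper, they only show the kind of specification the report finds absent:

```python
# Hypothetical sketch: an explicitly documented split and model
# configuration. All values here are illustrative assumptions,
# not the paper's actual setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

SEED = 0  # fixed seed so the partition and fit are reproducible

# Synthetic stand-in for a tabular dataset (e.g., admissions or income).
X, y = make_classification(n_samples=1000, n_features=5, random_state=SEED)

# An explicit 80/20 train/test split with a fixed seed: the percentages
# and seed are exactly the details the "Dataset Splits" row says are missing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED
)

# Named hyperparameters (regularization strength C, iteration budget):
# the configuration the "Experiment Setup" row says is unreported.
model = LogisticRegression(C=1.0, max_iter=1000, random_state=SEED)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Reporting the library versions alongside such a script (e.g., via `pip freeze`) would also resolve the "Software Dependencies" row.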