Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Adjusting Machine Learning Decisions for Equal Opportunity and Counterfactual Fairness

Authors: Yixin Wang, Dhanya Sridhar, David Blei

TMLR 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate the algorithms, and the trade-off between accuracy and fairness, on datasets about admissions, income, credit, and recidivism." |
| Researcher Affiliation | Academia | Yixin Wang (University of Michigan); Dhanya Sridhar (Mila-Quebec AI Institute and Université de Montréal); David M. Blei (Columbia University) |
| Pseudocode | Yes | "Algorithm 1: The eco and cf decision makers (for additive-error models)." |
| Open Source Code | Yes | "The supplement provides software that reproduces the studies." |
| Open Datasets | Yes | "We study these approaches on simulated admissions data and on three public datasets, about income, credit, and recidivism." The adult income data (Dua & Graff, 2017a) and the German credit data (Dua & Graff, 2017b) contain data about people and decisions about who is loan-worthy. ProPublica's COMPAS data contains information about criminal defendants and decisions about their recidivism score. |
| Dataset Splits | No | The paper mentions a 'held-out test set' and 'training data' but does not provide the specific percentages, absolute counts, or citations to predefined splits needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide any details about the hardware used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'linear regression models' and 'logistic regression' but does not specify the software libraries or package versions needed to replicate the experiments. |
| Experiment Setup | No | Some parameters for the simulated data generation are given (e.g., "We fix the effect of test score on admissions to s = 2.0. We generate multiple datasets by varying the gender bias a and the historical disadvantage on test score."), but the paper does not provide experimental setup details such as hyperparameter values (e.g., learning rate, batch size) or training configurations for the ML models used in the experiments. |
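To illustrate what the missing "Dataset Splits" and "Experiment Setup" details would look like if fully reported, here is a minimal, hypothetical sketch. It assumes scikit-learn and synthetic data; none of the numbers (split fraction, seed, regularization strength) come from the paper, they only show the kind of specification the report finds absent:

```python
# Hypothetical sketch: an explicitly documented split and model
# configuration. All values here are illustrative assumptions,
# not the paper's actual setup.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

SEED = 0  # fixed seed so the partition and fit are reproducible

# Synthetic stand-in for a tabular dataset (e.g., admissions or income).
X, y = make_classification(n_samples=1000, n_features=5, random_state=SEED)

# An explicit 80/20 train/test split with a fixed seed: the percentages
# and seed are exactly the details the "Dataset Splits" row says are missing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED
)

# Named hyperparameters (regularization strength C, iteration budget):
# the configuration the "Experiment Setup" row says is unreported.
model = LogisticRegression(C=1.0, max_iter=1000, random_state=SEED)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Reporting the library versions alongside such a script (e.g., via `pip freeze`) would also resolve the "Software Dependencies" row.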