Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making

Authors: Drago Plecko, Elias Bareinboim

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We apply our method to three real-world datasets and derive new insights on bias amplification in prediction and decision-making.
Researcher Affiliation	Academia	Drago Plecko and Elias Bareinboim Causal Artificial Intelligence Lab Columbia University EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: Auditing Weak & Strong Business Necessity
Open Source Code	Yes	The source code for reproducing all the experiments can be found in our Github code repository https://github.com/dplecko/mind-the-gap. The code is also included with the supplementary materials, in the folder source-code.
Open Datasets	Yes	We analyze the MIMIC-IV (Ex. 2), COMPAS (Ex. 3), and Census (Ex. 4, Appendix C) datasets.
Dataset Splits	No	The paper mentions using real-world datasets but does not explicitly provide details about training, validation, or test splits (e.g., percentages or sample counts) for the experiments.
Hardware Specification	Yes	All experiments were performed on a MacBook Pro, with the M3 Pro chip and 36 GB RAM on macOS 14.1 (Sonoma).
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used (e.g., Python, PyTorch, scikit-learn versions).
Experiment Setup	No	The paper describes the context and purpose of the experiments (e.g., decision rules like Ŷ = 1(S > Quant(0.5; S))), but it does not specify concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for the underlying machine learning models.
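
The thresholding rule quoted in the Experiment Setup row, 1(S > Quant(0.5; S)), can be read as "decide positively whenever the score S exceeds the median of all scores." The following is a minimal sketch of that reading, not the authors' implementation: the function name `threshold_decision`, the use of NumPy, and the example scores are assumptions made for illustration only.

```python
import numpy as np


def threshold_decision(scores, q=0.5):
    """Return 1 where a score strictly exceeds the q-th empirical quantile.

    Hypothetical helper: the name and signature are illustrative and do not
    come from the paper or its code repository.
    """
    scores = np.asarray(scores, dtype=float)
    cutoff = np.quantile(scores, q)  # q = 0.5 gives the median of the scores
    return (scores > cutoff).astype(int)


# Made-up scores, used only to show the rule in action.
scores = [0.1, 0.4, 0.35, 0.8, 0.9]
decisions = threshold_decision(scores)
```

With q = 0.5 the cutoff for these made-up scores is their median, 0.4, so only the two scores strictly above it receive a positive decision.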