Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making

Authors: Drago Plecko, Elias Bareinboim

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We apply our method to three real-world datasets and derive new insights on bias amplification in prediction and decision-making. |
| Researcher Affiliation | Academia | Drago Plecko and Elias Bareinboim, Causal Artificial Intelligence Lab, Columbia University, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Auditing Weak & Strong Business Necessity |
| Open Source Code | Yes | The source code for reproducing all the experiments can be found in our Github code repository https://github.com/dplecko/mind-the-gap. The code is also included with the supplementary materials, in the folder source-code. |
| Open Datasets | Yes | We analyze the MIMIC-IV (Ex. 2), COMPAS (Ex. 3), and Census (Ex. 4, Appendix C) datasets. |
| Dataset Splits | No | The paper mentions using real-world datasets but does not explicitly provide details about training, validation, or test splits (e.g., percentages or sample counts) for the experiments. |
| Hardware Specification | Yes | All experiments were performed on a MacBook Pro, with the M3 Pro chip and 36 GB RAM on macOS 14.1 (Sonoma). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used (e.g., Python, PyTorch, scikit-learn versions). |
| Experiment Setup | No | The paper describes the context and purpose of the experiments (e.g., decision rules like $\widehat{Y} = \mathbb{1}(S > \mathrm{Quant}(0.5; S))$), but it does not specify concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for the underlying machine learning models. |
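
The decision rule quoted in the Experiment Setup row thresholds a score S at its 0.5-quantile (the median). A minimal sketch of such a rule follows, assuming S is a one-dimensional array of numeric scores. The function name `median_threshold_decision` and the sample scores are hypothetical illustrations, not taken from the paper or its repository.

```python
import numpy as np


def median_threshold_decision(scores):
    """Return 1 where a score strictly exceeds the empirical median, else 0.

    Hypothetical helper illustrating a rule of the form
    Y_hat = 1(S > Quant(0.5; S)); not the authors' implementation.
    """
    scores = np.asarray(scores, dtype=float)
    # Empirical 0.5-quantile of the scores (the median).
    threshold = np.quantile(scores, 0.5)
    # Strict inequality, matching the ">" in the quoted rule.
    return (scores > threshold).astype(int)


# Illustrative scores only (made up for this example).
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65])
decisions = median_threshold_decision(scores)
```

Because the comparison is strict, an individual whose score equals the median receives a decision of 0 under this sketch.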