Counterfactual Fairness by Combining Factual and Counterfactual Predictions

Authors: Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, David I. Inouye

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on both synthetic and semi-synthetic datasets demonstrate the validity of our analysis and methods.
Researcher Affiliation | Academia | Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, David I. Inouye; Elmore Family School of Electrical and Computer Engineering, Purdue University; {zhou1059, liu3351, bai116, jinggao, mkocaoglu, dinouye}@purdue.edu
Pseudocode | Yes | Algorithm 1: Plug-in Counterfactual Fairness (PCF) (an illustrative sketch follows the table)
Open Source Code | Yes | Code can be found at https://github.com/inouye-lab/pcf
Open Datasets | Yes | In this section, we consider the Law School Success dataset [Wightman, 1998], where the sensitive attribute is gender and the target is first-year grade. ... To compute TE, we need access to ground-truth counterfactuals. Hence we train a generative model on the real dataset to generate a semi-synthetic dataset following the method in Zuo et al. [2023].
Dataset Splits | No | The paper mentions training, testing, and sometimes evaluating on a subset of the data (e.g., the Law School dataset), but it does not provide explicit split ratios (such as "80/10/10") or sample counts for the train/validation/test sets. It states "Given a test set Dtest" when defining metrics, but not how this test set is formed in relation to the other splits.
Hardware Specification | Yes | All GPU-related experiments are run on an RTX A5000.
Software Dependencies | No | In our synthetic experiments, we mainly use KNN-based predictors. We use the default parameters in scikit-learn. All MLP methods use a structure with hidden layers (20, 20) and Tanh activation. In semi-synthetic experiments, all MLP methods use a structure with hidden layers (5, 5) and Tanh activation, as this is closer to the ground-truth SCM. The paper mentions tools such as scikit-learn but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | All MLP methods use a structure with hidden layers (20, 20) and Tanh activation. In semi-synthetic experiments, all MLP methods use a structure with hidden layers (5, 5) and Tanh activation, as this is closer to the ground-truth SCM. (A configuration sketch follows the table.)
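
The Pseudocode row above refers to Algorithm 1, Plug-in Counterfactual Fairness (PCF). As a rough Python illustration of the plug-in idea only, not the authors' exact algorithm, the sketch below averages a fitted factual predictor's outputs over counterfactual versions of the features for every value of the sensitive attribute. The class name PlugInCFPredictor and the counterfactual_features callable are hypothetical; in practice the counterfactual features would have to come from an SCM or a trained generative model such as the one mentioned in the Open Datasets row.

    import numpy as np

    class PlugInCFPredictor:
        """Illustrative plug-in counterfactual-fairness predictor.

        base_model: a fitted predictor exposing a .predict(X) method.
        counterfactual_features: callable (X, a) -> feature matrix under the
            intervention A <- a (e.g., produced by an SCM or generative model).
        sensitive_values: iterable of all values the sensitive attribute can take.
        """

        def __init__(self, base_model, counterfactual_features, sensitive_values):
            self.base_model = base_model
            self.counterfactual_features = counterfactual_features
            self.sensitive_values = sensitive_values

        def predict(self, X):
            # Average the base model's predictions over the factual/counterfactual
            # feature versions for every value of the sensitive attribute, so the
            # output no longer depends on its factual value.
            preds = [
                self.base_model.predict(self.counterfactual_features(X, a))
                for a in self.sensitive_values
            ]
            return np.mean(preds, axis=0)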
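
The Software Dependencies and Experiment Setup rows describe KNN predictors with scikit-learn defaults and MLPs with Tanh activation and hidden layers (20, 20) or (5, 5). A minimal configuration sketch under those settings is shown below; the choice of regression estimators (KNeighborsRegressor and MLPRegressor) is an assumption, since the excerpt names scikit-learn but not the exact estimator classes or version.

    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.neural_network import MLPRegressor

    # KNN-based predictor with scikit-learn's default parameters (synthetic experiments).
    knn_predictor = KNeighborsRegressor()

    # MLP for the synthetic experiments: two hidden layers of 20 units, Tanh activation.
    mlp_synthetic = MLPRegressor(hidden_layer_sizes=(20, 20), activation="tanh")

    # MLP for the semi-synthetic experiments: two hidden layers of 5 units,
    # reported as closer to the ground-truth SCM.
    mlp_semi_synthetic = MLPRegressor(hidden_layer_sizes=(5, 5), activation="tanh")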