FRAPPÉ: A Group Fairness Framework for Post-Processing Everything

Authors: Alexandru Tifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, Flavien Prost

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods.
Researcher Affiliation | Collaboration | 1) Department of Computer Science, ETH Zurich; 2) Google DeepMind. Correspondence to: Alexandru Tifrea <alexandru.tifrea@inf.ethz.ch>, Flavien Prost <fprost@google.com>.
Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/google-research/google-research/tree/master/postproc_fairness.
Open Datasets | Yes | We conduct experiments on standard datasets for assessing fairness mitigation techniques, namely Adult (Becker & Kohavi, 1996) and COMPAS (Angwin et al., 2016), as well as two recently proposed datasets: the high school longitudinal study (HSLS) dataset (Jeong et al., 2022) and ENEM (Alghamdi et al., 2022). We also evaluate FRAPPÉ on data with continuous sensitive attributes (i.e. the Communities & Crime dataset (Redmond, 2009)).
Dataset Splits | Yes | We adopt the standard practice in the literature, and select essential hyperparameters such as the learning rate so as to minimize prediction error on a holdout validation set, for all the baselines in our experiments. We select the optimal learning rate by minimizing the prediction error on a held-out validation set.
Hardware Specification | Yes | The machine we used for these measurements has 32 1.5 GHz CPUs.
Software Dependencies | No | The paper mentions toolkits such as 'tensorflow-model-remediation' but does not provide version numbers for the software dependencies used in its implementation, such as Python or core libraries.
Experiment Setup | Yes | We adopt the standard practice in the literature, and select essential hyperparameters such as the learning rate so as to minimize prediction error on a holdout validation set, for all the baselines in our experiments. We use a 1-hidden-layer MLP with 64 hidden units to model the post-processing transformation. The optimal learning rate and early-stopping epoch are selected so as to minimize prediction error on a held-out validation set.
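
For concreteness, the sketch below illustrates the setup quoted in the last two rows in Python/TensorFlow. It is not the authors' released implementation (that lives in the postproc_fairness repository linked above): the 1-hidden-layer MLP with 64 hidden units and the validation-based learning-rate selection come directly from the quoted text, while the input shape, candidate learning rates, loss, and number of epochs are illustrative assumptions, and the fairness regularizer that FRAPPÉ adds to the training objective is omitted.

import tensorflow as tf

def build_postprocessor(num_inputs=1):
    # 1-hidden-layer MLP with 64 hidden units, matching the quoted setup.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(num_inputs,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),  # outputs a logit-scale correction (assumed form)
    ])

def select_learning_rate(x_train, y_train, x_val, y_val,
                         candidate_lrs=(1e-4, 1e-3, 1e-2)):
    # Pick the learning rate that minimizes prediction error on a held-out
    # validation set, mirroring the hyperparameter-selection practice quoted above.
    # The actual FRAPPE objective also includes a fairness penalty (omitted here).
    best_lr, best_err = None, float("inf")
    for lr in candidate_lrs:
        model = build_postprocessor(num_inputs=x_train.shape[1])
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
            loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
            metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0)],  # logit outputs
        )
        model.fit(x_train, y_train, epochs=10, batch_size=256, verbose=0)
        _, acc = model.evaluate(x_val, y_val, verbose=0)
        if 1.0 - acc < best_err:
            best_lr, best_err = lr, 1.0 - acc
    return best_lr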