FRAPPÉ: A Group Fairness Framework for Post-Processing Everything
Authors: Alexandru Tifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, Flavien Prost
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods. |
| Researcher Affiliation | Collaboration | 1 Department of Computer Science, ETH Zurich; 2 Google DeepMind. Correspondence to: Alexandru Tifrea <alexandru.tifrea@inf.ethz.ch>, Flavien Prost <fprost@google.com>. |
| Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/google-research/google-research/tree/master/postproc_fairness |
| Open Datasets | Yes | We conduct experiments on standard datasets for assessing fairness mitigation techniques, namely Adult (Becker & Kohavi, 1996) and COMPAS (Angwin et al., 2016), as well as two recently proposed datasets: the highschool longitudinal study (HSLS) dataset (Jeong et al., 2022), and ENEM (Alghamdi et al., 2022). We also evaluate FRAPPÉ on data with continuous sensitive attributes (i.e. the Communities & Crime dataset (Redmond, 2009)). |
| Dataset Splits | Yes | We adopt the standard practice in the literature, and select essential hyperparameters such as the learning rate so as to minimize prediction error on a holdout validation set, for all the baselines in our experiments. |
| Hardware Specification | Yes | The machine we used for these measurements has 32 1.5 GHz CPUs. |
| Software Dependencies | No | The paper mentions some software or toolkits like 'tensorflow-model-remediation' but does not provide specific version numbers for the software dependencies used in its implementation, such as Python or core libraries. |
| Experiment Setup | Yes | We use a 1-MLP with 64 hidden units to model the post-processing transformation. The optimal learning rate and early-stopping epoch are selected so as to minimize prediction error on a held-out validation set. |
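For reference, the Adult dataset cited in the Open Datasets row can be fetched directly from the UCI repository. The column names below follow the standard UCI schema; this is a convenience sketch, not the paper's own loading code, which lives in the linked repository.

```python
import pandas as pd

# Standard UCI column schema for the Adult (census income) dataset.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

adult = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
    header=None, names=COLUMNS, skipinitialspace=True,
)
print(adult.shape)  # roughly (32561, 15) for the train split
```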
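The Experiment Setup and Dataset Splits rows above specify the post-processing head (a 1-hidden-layer MLP with 64 units) and the model-selection protocol (learning rate and early-stopping epoch chosen to minimize prediction error on a held-out validation set). The sketch below illustrates one plausible reading of that setup in TensorFlow/Keras; the synthetic data, the candidate learning-rate grid, and the function name `make_postprocessing_mlp` are illustrative assumptions, not code from the paper's repository.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in data; the paper's experiments use Adult, COMPAS,
# HSLS, ENEM, and Communities & Crime instead.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 8)).astype("float32")
y_train = rng.integers(0, 2, size=1000).astype("float32")
X_val = rng.normal(size=(200, 8)).astype("float32")
y_val = rng.integers(0, 2, size=200).astype("float32")

def make_postprocessing_mlp(input_dim: int) -> tf.keras.Model:
    """1-hidden-layer MLP with 64 units, matching the stated
    '1-MLP with 64 hidden units' for the post-processing transform."""
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu",
                              input_shape=(input_dim,)),
        tf.keras.layers.Dense(1),  # logit-space output of the transform
    ])

# Select the learning rate (and, via the callback, the early-stopping
# epoch) by minimizing prediction error on the held-out validation set.
best_lr, best_err = None, float("inf")
for lr in (1e-4, 1e-3, 1e-2):  # candidate grid is an assumption
    model = make_postprocessing_mlp(input_dim=X_train.shape[1])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=20, verbose=0,
        callbacks=[tf.keras.callbacks.EarlyStopping(
            monitor="val_accuracy", patience=3,
            restore_best_weights=True)],
    )
    val_err = 1.0 - model.evaluate(X_val, y_val, verbose=0)[1]
    if val_err < best_err:
        best_lr, best_err = lr, val_err

print(f"selected learning rate: {best_lr} (val error {best_err:.3f})")
```

In the paper's actual framework, this MLP would post-process the outputs of a frozen base model under a fairness objective; the sketch covers only the model-selection loop quoted in the table.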