Everybody Needs Good Neighbours: An Unsupervised Locality-based Method for Bias Mitigation
Authors: Xudong Han, Timothy Baldwin, Trevor Cohn
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results over five datasets, spanning natural language processing and structured data classification tasks, show that our technique recovers proxy labels that correlate with unknown demographic data, and that our method outperforms all unsupervised baselines, while also achieving competitive performance with state-of-the-art supervised methods which are given access to demographic labels. |
| Researcher Affiliation | Collaboration | Xudong Han (1,2), Timothy Baldwin (1,2), Trevor Cohn (1); (1) The University of Melbourne, (2) Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). Contact: xudongh1@student.unimelb.edu.au, {tbaldwin,t.cohn}@unimelb.edu.au |
| Pseudocode | No | The paper presents an overview of ULPL in Figure 1 with a diagram and descriptions, but no formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | All baseline experiments are conducted with the fairlib library (Han et al., 2022b). Source code is available at https://github.com/HanXudong/An_Unsupervised_Locality-based_Method_for_Bias_Mitigation |
| Open Datasets | Yes | We consider the following benchmark datasets1 from the fairness literature: (1) Moji (Blodgett et al., 2016; Elazar & Goldberg, 2018), sentiment analysis with protected attribute race; (2) Bios (De-Arteaga et al., 2019; Subramanian et al., 2021), biography classification with protected attributes gender and economy; (3) Trust Pilot (Hovy et al., 2015), product rating prediction with protected attributes age, gender, and country; (4) COMPAS (Flores et al., 2016), recidivism prediction with protected attributes gender and race; and (5) Adult (Kohavi, 1996), income prediction with protected attributes gender and race. |
| Dataset Splits | Yes | Following Ravfogel et al. (2020), we randomly split the dataset into train (65%), dev (10%), and test (25%). |
| Hardware Specification | Yes | We conduct our experiments on an HPC cluster instance with 4 CPU cores, 32GB RAM, and one NVIDIA V100 GPU. |
| Software Dependencies | Yes | optimizer Adam (Kingma & Ba, 2015) |
| Experiment Setup | Yes | Hyperparameters are tuned using grid-search, in order to minimize distance to the optimal. ... batch size loguniform-integer[64, 2048]: 1024, 1024, 1024, 512, 1024 ... learning rate loguniform-float[1e-6, 1e-1]: 3e-5, 1e-5, 3e-5, 3e-4, 1e-4 |
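The dataset-splits row quotes a random 65%/10%/25% train/dev/test partition (following Ravfogel et al., 2020). A minimal sketch of such a split is below; the function name, seed handling, and use of Python's standard library are illustrative assumptions, not the authors' code.

```python
import random

def train_dev_test_split(examples, train_frac=0.65, dev_frac=0.10, seed=0):
    """Randomly split examples into train/dev/test subsets.

    The 65/10/25 ratio matches the split quoted from the paper;
    everything else here is an illustrative assumption.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # remaining ~25%
    return train, dev, test

train, dev, test = train_dev_test_split(list(range(1000)))
print(len(train), len(dev), len(test))  # 650 100 250
```

Fixing the seed makes the split reproducible across runs, which matters when reported numbers are averaged over repeated trials.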
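The experiment-setup row describes grid-search over log-uniform ranges (batch size in [64, 2048], learning rate in [1e-6, 1e-1]), selecting the configuration that minimizes distance to the optimal. The sketch below illustrates that selection loop; the specific grid points and the dummy objective are assumptions standing in for the paper's dev-set performance/fairness trade-off metric.

```python
import itertools

# Log-spaced grids over the search ranges quoted in the table
# (illustrative grid points, not the authors' exact grid).
batch_sizes = [64, 128, 256, 512, 1024, 2048]
learning_rates = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]

def dev_distance_to_optimal(batch_size, lr):
    # Placeholder objective: in the paper this would be the distance
    # between a candidate model's (performance, fairness) point and
    # the optimal point, measured on the dev set.
    return abs(lr - 3e-5) + abs(batch_size - 1024) / 1024

# Grid-search: evaluate every configuration, keep the minimizer.
best = min(itertools.product(batch_sizes, learning_rates),
           key=lambda cfg: dev_distance_to_optimal(*cfg))
print(best)  # (1024, 1e-05) under this placeholder objective
```

With a real objective, each call would train and evaluate a model, so the grid is typically kept coarse and log-spaced, as the quoted ranges suggest.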