Intersectional Unfairness Discovery

Authors: Gezheng Xu, Qi Chen, Charles Ling, Boyu Wang, Changjian Shui

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world text and image datasets demonstrate diverse and efficient discovery by BGGN.
Researcher Affiliation | Academia | Department of Computer Science, University of Western Ontario; University of Toronto; Vector Institute. Correspondence to: Boyu Wang <bwang@csd.uwo.ca>, Changjian Shui <changjian.shui@vectorinstitute.ai>.
Pseudocode | Yes | Algorithm 1: Bias Guided Generative Network (BGGN). (A hedged sketch of such a bias-guided sampling loop appears after this table.)
Open Source Code | Yes | The code is available at: https://github.com/xugezheng/BGGN
Open Datasets | Yes | CelebA (Image) (Liu et al., 2015): a face image dataset containing 200K images. ... Toxic (Text) (Borkan et al., 2019): the main task of this dataset is to predict the toxicity of text comments.
Dataset Splits | Yes | We split the data into Observation (or training) and Holdout datasets, where there is no intersectional sensitive attribute overlap between these two sub-datasets. ... After obtaining this enriched dataset D_bias with bias values, we randomly split it into an Observation set (70%) and a Holdout set (30%) to train the bias value predictor f̂(a) and the generator, with NO sensitive attributes overlapping. (A sketch of such a group-level split follows the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided.
Software Dependencies | No | The paper mentions DistilBERT but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | We train f(x) for 3 epochs with a batch size of 64. We utilize the Adam optimizer and fix the learning rate at 1e-4 for both the backbone model and classifier. ... we train the predictor for 60 epochs using MSE loss and the Adam optimizer with a learning rate of 1e-3. ... We first (pre-)train the vanilla generative model for 5 epochs with the Adam optimizer and set the learning rate at 1e-3. ... We conducted 500 sampling iterations, with a batch size of 128 for each sampling. ... set a relatively small learning rate, with 2e-5 for the encoder and 1e-5 for the decoder. ... We set the resample number as 10, and the filter proportion as 0.2, on the CelebA dataset. (These values are consolidated in the configuration sketch after the table.)
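Algorithm 1 itself is not reproduced in this report, so the following is only a minimal sketch of what a bias-guided sampling loop could look like, pieced together from the quoted setup (500 sampling iterations, batch size 128, filter proportion 0.2, encoder lr 2e-5, decoder lr 1e-5). `AttrGenerator`, the attribute dimension, the stand-in bias predictor, and the reconstruction objective are all illustrative assumptions, not the paper's code.

```python
# Hedged sketch of a bias-guided sampling loop (assumptions noted inline);
# the paper's actual Algorithm 1 (BGGN) may differ substantially.
import torch
import torch.nn as nn

class AttrGenerator(nn.Module):
    """Toy autoencoder-style generator over relaxed sensitive-attribute vectors."""
    def __init__(self, attr_dim=10, latent_dim=8):   # dimensions are illustrative
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(attr_dim, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                     nn.Linear(32, attr_dim))

    def sample(self, n):
        z = torch.randn(n, self.mu.out_features)     # draw latents from the prior
        return torch.sigmoid(self.decoder(z))        # relaxed attribute vectors in [0, 1]

# Stands in for the trained bias value predictor f̂(a) from the quoted setup.
bias_predictor = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

gen = AttrGenerator()
# Separate learning rates for encoder/decoder, as quoted (2e-5 / 1e-5).
opt = torch.optim.Adam([
    {"params": gen.encoder.parameters(), "lr": 2e-5},
    {"params": gen.mu.parameters(),      "lr": 2e-5},
    {"params": gen.decoder.parameters(), "lr": 1e-5},
])

FILTER_PROP, N_ITERS, BATCH = 0.2, 500, 128          # quoted values
for _ in range(N_ITERS):
    with torch.no_grad():
        cand = gen.sample(BATCH)                     # candidate attribute vectors
        scores = bias_predictor(cand).squeeze(-1)    # predicted bias value per candidate
    k = int(FILTER_PROP * BATCH)
    top = cand[scores.topk(k).indices]               # keep the highest-bias candidates
    # Pull the generator toward high-bias regions by reconstructing the kept
    # samples (a plausible surrogate objective; the paper's exact loss is not quoted).
    recon = torch.sigmoid(gen.decoder(gen.mu(gen.encoder(top))))
    loss = nn.functional.binary_cross_entropy(recon, top)
    opt.zero_grad(); loss.backward(); opt.step()
```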
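The quoted split protocol (Observation 70% / Holdout 30%, with no intersectional sensitive attribute overlap) can be implemented by assigning entire attribute combinations, rather than individual rows, to one side. A minimal pandas sketch, assuming placeholder column names that are not the paper's exact schema:

```python
# Group-level 70/30 split with no intersectional-attribute overlap (a sketch;
# attribute column names such as "gender" are hypothetical placeholders).
import numpy as np
import pandas as pd

def split_by_intersectional_group(df, attrs, obs_frac=0.7, seed=0):
    """Assign whole intersectional groups to either the Observation or Holdout set."""
    groups = df[attrs].astype(str).agg("-".join, axis=1)  # e.g. "female-young-..."
    uniq = groups.unique()
    rng = np.random.default_rng(seed)
    rng.shuffle(uniq)
    n_obs = int(obs_frac * len(uniq))                     # 70% of groups -> Observation
    obs_groups = set(uniq[:n_obs])
    mask = groups.isin(obs_groups)
    return df[mask], df[~mask]                            # (Observation, Holdout)

# Usage: by construction, no attribute combination appears in both splits.
# obs, hold = split_by_intersectional_group(d_bias, ["gender", "age", "skin_tone"])
```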
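For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. Every value below is taken from the quoted text; fields the paper does not quote are omitted rather than guessed.

```python
# Consolidated view of the quoted experiment setup (values only, no inventions).
CONFIG = {
    "task_model_f": {"epochs": 3, "batch_size": 64,
                     "optimizer": "Adam", "lr": 1e-4},   # backbone + classifier
    "bias_predictor": {"epochs": 60, "loss": "MSE",
                       "optimizer": "Adam", "lr": 1e-3},
    "generator_pretrain": {"epochs": 5, "optimizer": "Adam", "lr": 1e-3},
    "bias_guided_sampling": {"iterations": 500, "batch_size": 128,
                             "encoder_lr": 2e-5, "decoder_lr": 1e-5},
    "celeba": {"resample_number": 10, "filter_proportion": 0.2},
}
```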