reproducibilityindex.ai

Intersectional Unfairness Discovery

Authors: Gezheng Xu, Qi Chen, Charles Ling, Boyu Wang, Changjian Shui

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on real-world text and image datasets demonstrate a diverse and efficient discovery of BGGN.
Researcher Affiliation	Academia	(1) Department of Computer Science, University of Western Ontario; (2) University of Toronto; (3) Vector Institute. Correspondence to: Boyu Wang <bwang@csd.uwo.ca>, Changjian Shui <changjian.shui@vectorinstitute.ai>.
Pseudocode	Yes	Algorithm 1 Bias Guided Generative Network (BGGN)
Open Source Code	Yes	The code is available at: https://github.com/xugezheng/BGGN.
Open Datasets	Yes	CelebA (Image) (Liu et al., 2015) A face image dataset containing 200K images. ... Toxic (Text) (Borkan et al., 2019). The main task of this dataset is to predict the toxicity of text comments
Dataset Splits	Yes	We split the data into Observation (or training) and Holdout datasets, where there is no intersectional sensitive attribute overlap between these two sub-datasets. ... After obtaining this enriched dataset D_bias with bias value, we randomly split it into an Observation set (70%) and a Holdout set (30%) to train the bias value predictor L_f(a) and the generator, with NO sensitive attributes overlapping.
Hardware Specification	No	No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are provided.
Software Dependencies	No	The paper mentions 'DistilBERT' but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup	Yes	We train f(x) for 3 epochs with a batch size of 64. We utilize the Adam optimizer and fix the learning rate at 1e-4 for both the backbone model and classifier. ... we train the predictor for 60 epochs using MSE loss and Adam optimizer with a learning rate of 1e-3. ... We first (pre-)train the vanilla generative model for 5 epochs with Adam optimizer and set the learning rate at 1e-3. ... We conducted 500 sampling iterations, with a batch size of 128 for each sampling. ... set a relatively small learning rate, with 2e-5 for the encoder and 1e-5 for the decoder. ... We set the resample number as 10, and the filter proportion as 0.2 on CelebA dataset.
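
The Dataset Splits row describes a 70%/30% Observation/Holdout split in which no intersectional sensitive attribute combination appears on both sides. The following is a minimal sketch of one way such a split could be done, not the authors' implementation. It assumes each record is a hypothetical (attribute_tuple, bias_value) pair and that the 70% is taken over distinct attribute combinations; the function name and the seed are illustrative only.

```python
import random


def split_by_attribute_group(records, observation_frac=0.7, seed=0):
    """Split records so that no attribute combination lands on both sides.

    records: list of (attribute_tuple, bias_value) pairs (assumed layout).
    observation_frac: share of distinct attribute combinations assigned
        to the Observation set (0.7 mirrors the 70%/30% split quoted above).
    """
    # Collect the distinct attribute combinations and shuffle them reproducibly.
    groups = sorted({attrs for attrs, _ in records})
    rng = random.Random(seed)
    rng.shuffle(groups)

    # Assign whole combinations, not individual records, to each side.
    cut = int(round(observation_frac * len(groups)))
    observation_groups = set(groups[:cut])

    observation = [r for r in records if r[0] in observation_groups]
    holdout = [r for r in records if r[0] not in observation_groups]
    return observation, holdout


if __name__ == "__main__":
    # Toy data: 10 distinct combinations of two attributes, 4 records each.
    toy = [((i % 2, i % 5), 0.1 * i) for i in range(40)]
    obs, hold = split_by_attribute_group(toy)
    print(len(obs), len(hold))
```

Splitting at the level of attribute combinations rather than individual records is what guarantees the non-overlap property; a plain record-level random split would only preserve it if every combination occurred exactly once.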