FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods
Authors: Xiaotian Han, Jianfeng Chi, Yu Chen, Qifan Wang, Han Zhao, Na Zou, Xia Hu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This work offers the following key contributions: ... extensive benchmarking, which yields key insights from 45,079 experiments and 14,428 GPU hours. |
| Researcher Affiliation | Collaboration | 1 Texas A&M University, 2 Meta AI, 3 Anytime AI, 4 UIUC, 5 University of Houston, 6 Rice University |
| Pseudocode | Yes | Algorithm 1: Adv Debias in AIF360; Algorithm 2: Adv Debias in Fairlearn; Algorithm 3: Adv Debias in FFB |
| Open Source Code | Yes | The benchmark is available at https://github.com/ahxt/fair_fairness_benchmark. |
| Open Datasets | Yes | Adult (Kohavi & Becker, 1996)... German (Dua & Graff, 2017)... KDDCensus (Dua & Graff, 2017)... COMPAS (Larson et al., 2016)... Bank (Dua & Graff, 2017)... ACS-I/E/P/M/T (Ding et al., 2021)... CelebA-A/W/S (Liu et al., 2015)... UTKFace (Zhang et al., 2017)... Jigsaw (Jigsaw, 2018)... The dataset loading code is available in the benchmark repository linked above. |
| Dataset Splits | Yes | We split the data into training and test sets with random seeds, using the training set to train the model and the test set to evaluate its performance. ... The results are based on 10 trials with varying data splits and training seeds to ensure reliable outcomes. (A minimal sketch of this protocol follows the table.) |
| Hardware Specification | No | The paper reports 14,428 GPU hours but does not specify the GPU models, CPU models, or other hardware used for the experiments. |
| Software Dependencies | No | The paper mentions various software components, including AIF360 (Bellamy et al., 2018), Fairlearn (Bird et al., 2020), scikit-learn (Pedregosa et al., 2011), PyTorch-style implementations (Paszke et al., 2019), and Adam (Kingma & Ba, 2014), but provides no version numbers for any of them. |
| Experiment Setup | Yes | For tabular datasets, we use a two-layer Multi-Layer Perceptron with 256 neurons in each hidden layer for all datasets. We use Adam (Kingma & Ba, 2014) as the optimizer with a learning rate of 0.001 for both tabular and image data. ... We employ a step-decay strategy for the learning rate, halving it every 50 training steps, and stop training once the learning rate falls below 1e-5. Table 5: Common hyperparameters. Table 6: Fairness-control hyperparameter selections. Table 7: Batch sizes for the different datasets during training. (A PyTorch sketch of this setup follows the table.) |
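As a reading aid for the Dataset Splits row, the following is a minimal sketch of the quoted evaluation protocol: 10 trials with different data splits and training seeds, reporting the mean and standard deviation of a metric. The synthetic data, the logistic-regression model, the 80/20 split ratio, and the accuracy metric are assumptions for illustration only; they are not FFB's actual pipeline, which lives in the repository linked above.

```python
# Hypothetical sketch of the 10-trial protocol: vary the split/training seed,
# collect a metric per trial, and report mean ± std. All data and models here
# are placeholders, not FFB's actual datasets or fairness methods.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 20))                       # synthetic stand-in for a tabular dataset
y = (X[:, 0] + 0.1 * rng.normal(size=2000) > 0).astype(int)

scores = []
for seed in range(10):                                # 10 trials with varying splits and seeds
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed        # assumed 80/20 split
    )
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # placeholder model
    scores.append(clf.score(X_te, y_te))              # placeholder metric (accuracy)

print(f"accuracy over 10 trials: {np.mean(scores):.3f} ± {np.std(scores):.3f}")
```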
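For the Experiment Setup row, here is a minimal PyTorch sketch, under stated assumptions, of the described tabular configuration: a two-hidden-layer MLP with 256 neurons per layer, Adam with a learning rate of 0.001, the learning rate halved every 50 training steps (interpreted here as a `StepLR` schedule), and training stopped once the learning rate drops below 1e-5. The input dimension, synthetic data, batch size, and binary cross-entropy loss are placeholders; the paper's per-dataset batch sizes are in its Table 7, and FFB's actual training code is in the repository linked above.

```python
# Hypothetical sketch of the quoted tabular training setup; data and sizes are
# placeholders, not the FFB benchmark's real datasets or hyperparameters.
import torch
import torch.nn as nn

torch.manual_seed(0)
input_dim = 102                                   # assumed tabular feature dimension
x = torch.randn(1024, input_dim)                  # synthetic stand-in for a training split
y = torch.randint(0, 2, (1024,)).float()

# Two hidden layers with 256 neurons each, as described in the setup.
model = nn.Sequential(
    nn.Linear(input_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)       # Adam, lr = 0.001
# "Halving every 50 training steps" interpreted as a step schedule with gamma=0.5.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)
criterion = nn.BCEWithLogitsLoss()                               # placeholder loss

batch_size, step = 256, 0                                        # assumed batch size
while scheduler.get_last_lr()[0] >= 1e-5:         # stop once the lr decays below 1e-5
    idx = torch.randint(0, x.size(0), (batch_size,))
    optimizer.zero_grad()
    loss = criterion(model(x[idx]).squeeze(-1), y[idx])
    loss.backward()
    optimizer.step()
    scheduler.step()
    step += 1

print(f"trained for {step} steps; final lr = {scheduler.get_last_lr()[0]:.1e}")
```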