reproducibilityindex.ai

FairGBM: Gradient Boosting with Fairness Constraints

Authors: André Cruz, Catarina G Belém, João Bravo, Pedro Saleiro, Pedro Bizarro

ICLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our method on five large-scale public benchmark datasets, popularly known as folktables datasets, as well as on a real-world financial services case-study. We compare FairGBM with a set of constrained optimization baselines from the Fair ML literature.
Researcher Affiliation	Collaboration	Feedzai; MPI for Intelligent Systems, Tübingen; UC Irvine
Pseudocode	Yes	Algorithm 1: FairGBM training pseudocode
Open Source Code	Yes	Our implementation shows an order of magnitude speedup in training time relative to related work, a pivotal aspect to foster the widespread adoption of FairGBM by real-world practitioners. (footnote 1: https://github.com/feedzai/fairgbm)
Open Datasets	Yes	We validate our method on five large-scale public benchmark datasets, popularly known as folktables datasets, as well as on a real-world financial services case-study. The folktables datasets were put forth by Ding et al. (2021) and are derived from the American Community Survey (ACS) public use microdata sample from 2018.
Dataset Splits	Yes	Each task is randomly split in training (60%), validation (20%), and test (20%) data.
Hardware Specification	Yes	ACSIncome and AOF experiments: Intel i7-8650U CPU, 32GB RAM. ACSEmployment, ACSMobility, ACSTravelTime, ACSPublicCoverage experiments: each model trained in parallel on a cluster. Resources per training job: 1 vCPU core (Intel Xeon E5-2695), 8GB RAM.
Software Dependencies	No	The paper mentions "LightGBM implementation" and its language "C++" and "Python interface" but does not provide specific version numbers for LightGBM, Python, or any other critical libraries/dependencies. The reproducibility checklist points to supplementary materials but does not explicitly list software versions in the text.
Experiment Setup	Yes	To control for the variability of results when selecting different hyperparameters, we randomly sample 100 hyperparameter configurations of each algorithm. In the case of EG and GS, both algorithms already fit n base estimators as part of a single training procedure. Hence, we run 10 trials of EG and GS, each with a budget of n = 10 iterations, for a total budget of 100 models trained (leading to an equal budget for all algorithms).
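
The Dataset Splits row describes a random 60% / 20% / 20% partition into training, validation, and test data. A minimal sketch of such a split is shown below, using only the Python standard library. The function name `split_indices`, the seed, and the rounding rule are assumptions made for illustration; the paper's exact splitting code and random seed are not given in this summary.

```python
import random


def split_indices(n_rows, train_frac=0.6, val_frac=0.2, seed=42):
    """Randomly partition row indices into train / validation / test.

    Hypothetical helper: the seed and rounding rule are assumptions,
    not taken from the paper.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded shuffle for repeatability

    n_train = int(round(train_frac * n_rows))
    n_val = int(round(val_frac * n_rows))

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder (about 20%)
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
```

Fixing the seed keeps the three partitions identical across runs, which matters when several algorithms are compared on the same validation and test data.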
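
The Experiment Setup row equalizes the training budget: 100 randomly sampled configurations for single-model algorithms, versus 10 trials of n = 10 iterations each for EG and GS. The sketch below illustrates that accounting. The search space (parameter names and ranges) is entirely hypothetical, since the summary above does not list the hyperparameters that were sampled, and `sample_configs` and `models_trained` are illustrative helpers rather than part of any library.

```python
import random

# Hypothetical search space: names and ranges are illustrative assumptions,
# not the ones used in the paper.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, -1),  # log-uniform
    "num_leaves": lambda rng: rng.randint(8, 128),
    "n_estimators": lambda rng: rng.randint(50, 500),
}


def sample_configs(n_configs, seed=0):
    """Draw n_configs independent random hyperparameter configurations."""
    rng = random.Random(seed)
    return [
        {name: draw(rng) for name, draw in SEARCH_SPACE.items()}
        for _ in range(n_configs)
    ]


def models_trained(n_trials, models_per_trial):
    """Total number of models fit across all trials."""
    return n_trials * models_per_trial


# Single-model algorithms: 100 trials, one model each.
single_model_configs = sample_configs(100)
single_model_budget = models_trained(len(single_model_configs), 1)

# EG and GS: 10 trials, each fitting n = 10 base estimators.
reduction_configs = sample_configs(10, seed=1)
reduction_budget = models_trained(len(reduction_configs), 10)
```

Under these assumptions both budgets come to 100 trained models, matching the equal-budget comparison described in the row above.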