Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Fair Densities via Boosting the Sufficient Statistics of Exponential Families

Authors: Alexander Soen, Hisham Husain, Richard Nock

ICML 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical results are present to display the quality of result on real-world data. ... We empirically evaluate our approach with respect to data, prediction, and clustering fairness; and further present an empirical test of FBDE used for continuous domains. ... In this section, we (1) verify that FBDE debiased densities and adheres to specified data fairness measures; (2) inspect the implications of utilizing samples produced by a FBDE debiased density in downstream tasks, specifically, prediction and clustering tasks; (3) explore the interpretability of FBDE when utilizing decision tree (DT) weak learners (WLs); and (4) present an experiment on a dataset with continuous X.
Researcher Affiliation	Collaboration	Alexander Soen * 1 2, Hisham Husain * 2, Richard Nock 3. * Work complete prior to joining Amazon. Work complete prior to joining Google Research. 1 Australian National University; 2 Amazon; 3 Google Research. Correspondence to: Alexander Soen <EMAIL>.
Pseudocode	Yes	Algorithm 1 FBDE(WL, T, τ, Q_INIT, ϑ_t). 1: input: weak learner WL, # iter. T, SR τ, init. Q_INIT, input dist. P, leverage ϑ_t; 2: Q_0 ← Q_INIT (with SR_0 > τ); 3: for t = 1, ..., T do; 4: c_t ← WL(P, Q_{t−1}); 5: Q_t ∝ Q_{t−1} · exp(ϑ_t c_t); 6: end for; 7: return: Q_T (for fairness, ϑ_t ∈ {ϑ_t^E, ϑ_t^R})
Open Source Code	Yes	Code for FBDE is available at www.github.com/alexandersoen/fbde.
Open Datasets	Yes	To analyze these points, we evaluate FBDE over preprocessed COMPAS (binary S = race) and ADULT (binary S = sex) datasets provided by AIF360 (Bellamy et al., 2019). ... Public at: www.github.com/Trusted-AI/AIF360 ... The Dutch Census dataset is from a 2001 Netherlands census, where data represents aggregated groups of people. ... The German Credit dataset consists of individual bank holders, with the prediction task being to determine whether or not to grant credit to someone. We utilize the pre-processed version provided by AIF360.
Dataset Splits	Yes	In evaluating all approaches, we utilize 5-fold cross validation and evaluate all measurements using the test set (whenever appropriate).
Hardware Specification	Yes	All training was on a MacBook Pro (16 GB memory, M1, 2020). ... AS thanks members of the ANU Humanising Machine Intelligence program for discussions on fairness and ethical concerns in AI, and the NeCTAR Research Cloud for providing computational resources, an Australian research platform supported by the National Collaborative Research Infrastructure Strategy.
Software Dependencies	No	The paper mentions using "scikit-learn" but does not specify its version number. It also mentions "Python" indirectly through common usage, but no specific version for it or other libraries is given.
Experiment Setup	Yes	We consider 4 configurations of FBDE, boosted for T = 32 iterations. We consider a fixed fairness budget τ = 0.8 throughout and take a combination of exact vs relative leverage; and SR0 = 1 vs SR0 = 0.9. We designate each configuration by M-X-Y, where X encodes the leverage and Y encodes the base rate, i.e., M-E-1.0 uses exact leverage with SR0 = 1. DT WLs are calibrated using Platt's method (Platt et al., 1999). For baselines, we consider two different data pre-processing approaches. Firstly, we consider the max entropy approach proposed by Celis et al. (2020) (MAXENT) with default parameters. Secondly, we compare against TabFairGAN proposed by Rajabi & Garibay (2022) (TABFAIR), which includes a separate training phase for fairness. In clustering, we consider a Rényi Fair K-means (K = 4) approach as per Baharlouei et al. (2019) (FAIRK).
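
Illustrative sketch (not from the paper or its repository): the loop quoted in the Pseudocode row can be read as a repeated multiplicative update, Q_t ∝ Q_{t−1} · exp(ϑ_t c_t), followed by renormalisation. The Python below shows only that update on a small finite domain. The function names, the sign-based `toy_weak_learner`, the fixed leverage of 0.1, and the three-outcome distributions are all assumptions made for illustration. It does not implement the paper's exact or relative leverage schedules, its weak learners, or any fairness guarantee with respect to τ.

```python
import math


def normalise(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


def toy_weak_learner(p, q):
    """Hypothetical stand-in for WL: +1 where P has more mass than Q, else -1."""
    return {x: 1.0 if p[x] > q[x] else -1.0 for x in q}


def boost(p, q_init, leverages, weak_learner=toy_weak_learner):
    """Apply one update Q_t ∝ Q_{t-1} * exp(theta_t * c_t) per leverage value."""
    q = normalise(q_init)
    for theta in leverages:
        c = weak_learner(p, q)  # c_t: one score per outcome
        q = normalise({x: q[x] * math.exp(theta * c[x]) for x in q})
    return q


def kl(p, q):
    """KL(P || Q) on a shared finite support."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)


if __name__ == "__main__":
    # Assumed toy data: P is the input distribution, q0 a uniform initial density.
    p = {"a": 0.7, "b": 0.2, "c": 0.1}
    q0 = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
    q8 = boost(p, q0, leverages=[0.1] * 8)
    print("KL before:", kl(p, q0), "KL after:", kl(p, q8))
```

In this toy setting each round shifts mass toward outcomes where P exceeds the current Q, so the divergence from P shrinks over the first few rounds. The setup quoted above uses T = 32 iterations and chooses the leverage per round; the constant leverage here is a placeholder for that choice.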