DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks

Authors: Boris van Breugel, Trent Kyono, Jeroen Berrevoets, Mihaela van der Schaar

NeurIPS 2021

Reproducibility assessment. Each entry below gives the variable, the assessed result, and the LLM's response.

Research Type: Experimental
LLM response: In our experiments, we show that DECAF successfully removes undesired bias and, in contrast to existing methods, is capable of generating high-quality synthetic data. Experimentally, we show how DECAF is compatible with several fairness/discrimination definitions used in the literature while still maintaining high downstream utility of the generated data.

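Demographic parity is one such fairness definition from the literature. Below is a minimal sketch of how it can be checked on model predictions; it is not taken from the paper, and the function and variable names are illustrative.

```python
import numpy as np

def demographic_parity_gap(y_pred, a):
    """Absolute difference in positive-prediction rates across a binary
    protected attribute `a` (0/1). A gap near 0 indicates demographic parity."""
    y_pred, a = np.asarray(y_pred), np.asarray(a)
    rate_a1 = y_pred[a == 1].mean()  # P(Y_hat = 1 | A = 1)
    rate_a0 = y_pred[a == 0].mean()  # P(Y_hat = 1 | A = 0)
    return abs(rate_a1 - rate_a0)

# Illustrative usage with random predictions and a random protected attribute:
rng = np.random.default_rng(0)
y_hat = rng.integers(0, 2, size=1000)
sex = rng.integers(0, 2, size=1000)
print(demographic_parity_gap(y_hat, sex))
```
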
Researcher Affiliation: Academia
LLM response: Boris van Breugel, University of Cambridge, bv292@cam.ac.uk; Trent Kyono, University of California, Los Angeles, tmkyono@ucla.edu; Jeroen Berrevoets, University of Cambridge, jb2384@cam.ac.uk; Mihaela van der Schaar, University of Cambridge, University of California, Los Angeles, and The Alan Turing Institute, mv472@cam.ac.uk.

Pseudocode: No
LLM response: The paper describes the DECAF method in detail in Section 5 and illustrates its architecture in Figure 2, but it does not include structured pseudocode or clearly labeled algorithm blocks.

Open Source Code: Yes
LLM response: PyTorch Lightning source code is available at https://github.com/vanderschaarlab/DECAF.

Open Datasets: Yes
LLM response: We experiment on the Adult dataset [40], with known bias between gender and income [10, 11]. We use the Credit Approval dataset from [40]. [40] Dheeru Dua and Casey Graff. UCI Machine Learning Repository, 2020. URL http://archive.ics.uci.edu/ml.

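For context, a minimal sketch of loading the Adult dataset. The paper pulls it from the UCI repository [40]; this sketch instead uses scikit-learn's OpenML mirror (an assumption, not the authors' loading code), whose column names follow the OpenML version.

```python
from sklearn.datasets import fetch_openml

# Adult ("census income") dataset; OpenML mirrors the UCI version [40].
adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target           # features and income label (<=50K / >50K)
print(X.shape, y.value_counts())
print(X["sex"].value_counts())            # protected attribute with known bias [10, 11]
```
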
Dataset Splits: No
LLM response: The paper mentions training data and models but does not provide specific details on training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) in the main body or the referenced appendices.

Hardware Specification: No
LLM response: The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory, or cloud instance types) used to run its experiments.

Software Dependencies: No
LLM response: The paper mentions using PyTorch Lightning and Tetrad [42] but does not provide version numbers for these or any other software dependencies; it cites only the year of Tetrad's release.

Experiment Setup: Yes
LLM response: For the MLP, we use a single hidden layer of size 100 with ReLU activation and the Adam optimizer (learning rate 0.001, betas (0.9, 0.999), epsilon 1e-08, weight decay 0).
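
A minimal PyTorch sketch of an MLP matching that description. The input/output dimensions and the training step are assumptions for illustration, not taken from the paper; the stated Adam hyperparameters happen to match PyTorch's defaults.

```python
import torch
import torch.nn as nn

# Classifier as described: one hidden layer of 100 units with ReLU activation.
# Input/output sizes are placeholders (e.g., encoded features, binary label).
n_features, n_classes = 108, 2
mlp = nn.Sequential(
    nn.Linear(n_features, 100),
    nn.ReLU(),
    nn.Linear(100, n_classes),
)

# Adam with the stated hyperparameters.
optimizer = torch.optim.Adam(
    mlp.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0
)

# One illustrative gradient step on random data.
x = torch.randn(32, n_features)
y = torch.randint(0, n_classes, (32,))
loss = nn.CrossEntropyLoss()(mlp(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```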