Flexibly Fair Representation Learning by Disentanglement
Authors: Elliot Creager, David Madras, Joern-Henrik Jacobsen, Marissa Weis, Kevin Swersky, Toniann Pitassi, Richard Zemel
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that the resulting encoder, which does not require the sensitive attributes for inference, enables the adaptation of a single representation to a variety of fair classification tasks with new target labels and subgroup definitions. We first provide proof-of-concept by generating a variant of the synthetic DSprites dataset... We then apply our method to a real-world tabular dataset (Communities & Crime) and an image dataset (CelebA), where we find that our method matches or exceeds the fairness-accuracy tradeoff of existing disentangled representation learning approaches on a majority of the evaluated subgroups. |
| Researcher Affiliation | Collaboration | 1University of Toronto, 2Vector Institute, 3University of Tübingen, 4Google Research. |
| Pseudocode | No | The paper describes the methods but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | The DSprites dataset contains 64×64-pixel images of white shapes against a black background, and was designed to evaluate whether learned representations have disentangled sources of variation. (footnote 4: https://github.com/deepmind/dsprites-dataset) Communities & Crime is a tabular UCI dataset containing neighborhood-level population statistics. (footnote 5: http://archive.ics.uci.edu/ml/datasets/communities+and+crime) The CelebA dataset contains over 200,000 images of celebrity faces. (footnote 6: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) |
| Dataset Splits | No | The paper mentions splitting data into a 'training set', an 'audit set', and later a 'test set', but does not provide specific percentages, counts, or detailed methodology for the splits, nor does it explicitly define a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions hyperparameters like beta and gamma, and refers to 'training details' in Appendix D (which is not provided in the given text), but does not explicitly list specific hyperparameter values (e.g., learning rate, batch size) or detailed training configurations in the main body. |
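
For reference, the DSprites archive linked in the Open Datasets row is distributed as a single NumPy `.npz` file in the deepmind/dsprites-dataset repository. The minimal loading sketch below is not taken from the paper; the filename and array keys are assumptions based on that repository's documentation, and the path should be adjusted to wherever the archive was downloaded.

```python
# Minimal sketch (not from the paper): loading the public DSprites archive
# referenced in the Open Datasets row above.
import numpy as np

# Assumed local copy of the file distributed in deepmind/dsprites-dataset.
DSPRITES_PATH = "dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz"

# The allow_pickle/encoding flags follow the repository's loading example.
data = np.load(DSPRITES_PATH, allow_pickle=True, encoding="latin1")
images = data["imgs"]             # binary 64x64 shape images
factors = data["latents_values"]  # ground-truth generative factors per image
print(images.shape, factors.shape)
```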