Fairness without the Sensitive Attribute via Causal Variational Autoencoder

Authors: Vincent Grari, Sylvain Lamprier, Marcin Detyniecki

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our proposed method empirically achieves significant improvement over existing works in the field. We observe that the generated proxy's latent space correctly recovers sensitive information and that our approach achieves a higher accuracy while obtaining the same level of fairness on two real datasets. For our experiments, we empirically evaluate the performance of our contribution on real-world datasets where the sensitive attribute s is available.
Researcher Affiliation | Collaboration | Sorbonne Université, CNRS, ISIR, F-75005 Paris, France; AXA, Paris, France; Polish Academy of Sciences, IBS PAN, Warsaw, Poland
Pseudocode | No | The paper describes the methodology in prose and uses diagrams to illustrate architectures, but it does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | For this purpose, we use the popular Adult UCI and Default datasets (descriptions in Appendix), often used in fair classification.
Dataset Splits | No | For the two datasets, we test different models where, for each, we repeat five runs by randomly sampling two subsets, 80% for the training set and 20% for the test set. The paper specifies train and test splits but does not explicitly mention a separate validation set (a minimal sketch of this split protocol follows the table).
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., specific GPU or CPU models, memory, or cloud computing instances).
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions) needed to replicate the experiments.
Experiment Setup | Yes | K different representations are sampled for each observation (x_ci, x_di, y_i) from the causal model (200 in our experiments). The hyperparameter λDP controls the impact of dependence between the output prediction hθ(x) ≈ p(y = 1 | x_d, x_c) and the sensitive proxy z. We consider a version of our model trained without the penalization term (λinf = 0.00) as a baseline. It is then compared to a version trained with a penalization term equal to 0.20. For the unfair model (leftmost graph), we observe that the convergence is stable and achieves a P-rule of 29.5%. As expected, the penalization loss, measured with the HGR, decreases when the hyperparameter λDP is increased. This allows the P-rule fairness metric to increase to 83.1% with a slight drop in accuracy. In Figure 5 we plot the distribution of the predicted probabilities for each sensitive attribute s for three different models: an unfair model with λDP = 0, and two fair models with λDP = 0.45 and 0.50, respectively.
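The Dataset Splits row quotes a protocol of five runs, each over a fresh random 80%/20% train/test split. Below is a minimal sketch of that protocol, assuming scikit-learn's train_test_split and a caller-supplied train_and_eval routine; the function names, seed values, and stratification choice are illustrative and not taken from the paper.

```python
# Minimal sketch of the protocol quoted in the Dataset Splits row:
# five runs, each with an independent random 80% train / 20% test split.
# `train_and_eval`, the seed values, and stratification are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

def repeated_holdout(X, y, train_and_eval, n_runs=5, test_size=0.20):
    """Evaluate a model over n_runs independent random 80/20 splits."""
    scores = []
    for seed in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed, stratify=y)
        scores.append(train_and_eval(X_tr, y_tr, X_te, y_te))
    scores = np.asarray(scores)
    return scores.mean(axis=0), scores.std(axis=0)
```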
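The Experiment Setup row refers to the P-rule fairness metric and to a penalization term, weighted by λDP, that measures the dependence between the prediction hθ(x) and the sensitive proxy z via HGR. The sketch below gives the standard p%-rule computation and the general shape of such a weighted objective; the HGR estimator is left as a placeholder callable (the paper's neural HGR estimator is not reproduced here), and all names are illustrative.

```python
# Hedged sketch: the standard p%-rule metric and a lambda_DP-weighted dependence
# penalty of the form described in the Experiment Setup row. `hgr_estimate` is a
# placeholder for a dependence estimator; the paper's neural HGR is not reproduced.
import numpy as np

def p_rule(y_prob, s, threshold=0.5, eps=1e-12):
    """p%-rule: min ratio of positive-prediction rates between groups s=1 and s=0."""
    y_hat = np.asarray(y_prob) >= threshold
    s = np.asarray(s)
    rate_1 = y_hat[s == 1].mean() + eps  # eps guards against an all-negative group
    rate_0 = y_hat[s == 0].mean() + eps
    return 100.0 * min(rate_1 / rate_0, rate_0 / rate_1)

def penalized_objective(pred_loss, y_prob, z_proxy, lambda_dp, hgr_estimate):
    """Prediction loss plus lambda_DP times a dependence estimate between the
    prediction h_theta(x) and the sensitive proxy z (HGR-style penalty)."""
    return pred_loss + lambda_dp * hgr_estimate(y_prob, z_proxy)
```

With this convention, the P-rule is reported on a 0–100 scale, matching the 29.5% (unfair model) and 83.1% (penalized model) values quoted above.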