Bounding and Approximating Intersectional Fairness through Marginal Fairness

Authors: Mathieu Molina, Patrick Loiseau

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental In this paper, our primary goal is to understand in detail the relationship between marginal and intersectional fairness through statistical analysis. We first identify a set of sufficient conditions under which an exact relationship can be obtained. Then, we prove bounds (easily computable through marginal fairness and other meaningful statistical quantities) in high probability on intersectional fairness in the general case. Beyond their descriptive value, we show that these theoretical bounds can be leveraged to derive a heuristic improving the approximation and bounds of intersectional fairness by choosing, in a relevant manner, protected attributes for which we describe intersectional subgroups. Finally, we test the performance of our approximations and bounds on real and synthetic datasets.
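The marginal-versus-intersectional distinction described in the abstract can be illustrated with a short sketch. This is not the paper's metric or code; the demographic-parity-style gap, function names, and data layout below are illustrative assumptions. Marginal unfairness measures the prediction-rate gap along each protected attribute separately, while intersectional unfairness measures it across joint subgroups.

```python
import numpy as np

def dp_gaps(y_pred, attrs):
    """Illustrative demographic-parity gaps (not the paper's exact metric).

    y_pred: binary predictions, shape (n,)
    attrs:  protected attributes, shape (n, k)
    Returns (marginal_gap, intersectional_gap)."""
    # Marginal: largest positive-rate gap over each attribute taken alone.
    marginal = 0.0
    for a in attrs.T:
        rates = [y_pred[a == v].mean() for v in np.unique(a)]
        marginal = max(marginal, max(rates) - min(rates))
    # Intersectional: same gap over joint subgroups of all attributes.
    groups = {}
    for key, y in zip(map(tuple, attrs), y_pred):
        groups.setdefault(key, []).append(y)
    rates = [np.mean(v) for v in groups.values()]
    return marginal, max(rates) - min(rates)
```

A classic XOR-style example shows why marginal fairness alone is not enough: each attribute can look perfectly fair in isolation while the joint subgroups are maximally unfair.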
Researcher Affiliation Academia Mathieu Molina, Inria Fair Play team, 91120 Palaiseau, France, mathieu.molina@inria.fr; Patrick Loiseau, Inria Fair Play team, 91120 Palaiseau, France, patrick.loiseau@inria.fr
Pseudocode Yes Algorithm 1 Greedy Partition Finder
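The paper's Algorithm 1 (Greedy Partition Finder) is not reproduced in this report. As a purely hypothetical illustration of the count-threshold grouping idea mentioned in the experiment setup (subgroups with fewer than τ samples merged into a single catch-all bucket), one might sketch:

```python
from collections import Counter

def group_by_count(subgroup_labels, tau):
    """Hypothetical sketch, not the paper's Algorithm 1: keep subgroups
    with at least `tau` samples and merge the rest into one 'other'
    bucket so that rare intersectional subgroups are not estimated
    from too few samples."""
    counts = Counter(subgroup_labels)
    return [g if counts[g] >= tau else "other" for g in subgroup_labels]
```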
Open Source Code Yes All the code used in our experiments can be found in the supplementary material or at https://github.com/mathieu-molina/Bound Approx Inter Marg Fairness.
Open Datasets Yes We used US Census data from 1990 [11]
Dataset Splits No The paper mentions sampling from datasets and fixing the number of samples for experiments but does not explicitly describe training, validation, and test splits with percentages, absolute counts, or citations to predefined splits.
Hardware Specification No The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies No The paper mentions training a 'Random Forest binary classifier' and code being available, but it does not specify any software versions for libraries or environments (e.g., Python version, scikit-learn version, TensorFlow/PyTorch version).
Experiment Setup Yes We then train a Random Forest binary classifier on a poverty binary label, where we weight the labels differently so as to obtain about the same number of predictions for each outcome. We fix a number of available samples n from 100 to 2,000, and we sample without replacement from the datasets. For each subset and each fixed sample size n, we sample from D_i and P_i 20 times. We always take δ = 0.1 when relevant. The choice of τ, the count threshold for grouping, always gives reasonable approximations, with τ = 1 being close to u_B, and a large τ making it close to u_I.
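A minimal sketch of this setup, assuming a scikit-learn Random Forest with balanced class weights as a stand-in for the paper's label weighting, and synthetic data in place of the Census features. All names, sizes, and constants here are illustrative, not the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy stand-in for the Census data: features X, imbalanced binary label y.
X = rng.normal(size=(5000, 8))
y = (X[:, 0] + rng.normal(size=5000) > 1.0).astype(int)

# Balanced class weights approximate "weighting the labels so as to obtain
# about the same number of predictions for each outcome".
clf = RandomForestClassifier(class_weight="balanced", random_state=0).fit(X, y)

# For each sample size n, draw 20 samples without replacement and predict.
for n in [100, 500, 2000]:
    for _ in range(20):
        idx = rng.choice(len(X), size=n, replace=False)
        preds = clf.predict(X[idx])
        # ...estimate marginal/intersectional fairness from preds here...
```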