How Far Can Fairness Constraints Help Recover From Biased Data?
Authors: Mohit Sharma, Amit Deshpande
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a general approach to extend the result of Blum & Stangl (2019) to different fairness constraints, data bias models, data distributions, and hypothesis classes. We strengthen their result, and extend it to the case when their stylized distribution has labels with Massart noise instead of i.i.d. noise. We prove a similar recovery result for arbitrary data distributions using fair reject option classifiers. We further generalize it to arbitrary data distributions and arbitrary hypothesis classes, i.e., we prove that for any data distribution, if the optimally accurate classifier in a given hypothesis class is fair and robust, then it can be recovered through fair classification with equal opportunity constraints on the biased distribution whenever the bias parameters satisfy certain simple conditions. |
| Researcher Affiliation | Collaboration | Work done during internship at Microsoft Research India. Affiliations: (1) Indraprastha Institute of Information Technology, Delhi, India; (2) Microsoft Research India. Correspondence to: Mohit Sharma <mohits@iiitd.ac.in>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It focuses on theoretical derivations and proofs. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There is no mention of code release, repository links, or code in supplementary materials. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets. It refers to 'stylized data distribution' or 'arbitrary data distributions' for theoretical analysis rather than for empirical evaluation with publicly accessible data. |
| Dataset Splits | No | The paper is theoretical and does not present empirical experiments. Therefore, it does not provide specific dataset split information for training, validation, or testing. |
| Hardware Specification | No | The paper is purely theoretical and does not describe any experimental setup or hardware used for computation. |
| Software Dependencies | No | The paper is purely theoretical and does not describe any software implementation details or dependencies with version numbers. |
| Experiment Setup | No | The paper is purely theoretical and does not describe any experimental setup details, concrete hyperparameter values, or training configurations. |
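
The paper itself releases no code (see the rows above). As a point of reference for the "Research Type" summary, which centers on fair classification under equal opportunity constraints, the following is a minimal, illustrative sketch of how an equal opportunity violation is commonly measured: as the gap in true positive rates between two groups. The function name, toy data, and NumPy implementation are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true positive rates between two groups.

    Equal opportunity asks that the true positive rate be the same
    across groups; a gap of 0 means the constraint holds exactly.
    All arrays are 1-D and binary-valued (group takes values 0/1).
    Illustrative helper only; not code from the paper.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        # Positive examples belonging to group g.
        positives = (y_true == 1) & (group == g)
        tprs.append(y_pred[positives].mean() if positives.any() else 0.0)
    return abs(tprs[0] - tprs[1])

# Toy usage: group 0 gets a TPR of 0.5, group 1 a TPR of 1.0.
y_true = np.array([1, 1, 1, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 1])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(equal_opportunity_gap(y_true, y_pred, group))  # prints 0.5
```

A gap of zero corresponds to a classifier that exactly satisfies equal opportunity; the recovery results summarized in the "Research Type" row concern classifiers constrained to (approximately) satisfy this condition on the biased distribution.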