Individual Arbitrariness and Group Fairness
Authors: Carol Long, Hsiang Hsu, Wael Alghamdi, Flavio Calmon
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present empirical results to show that arbitrariness is masked by favorable group-fairness and accuracy metrics for multiple fairness intervention methods, baseline models, and datasets. We also demonstrate the effectiveness of the ensemble in reducing the predictive multiplicity of fair models. (A minimal sketch of this ensemble idea follows the table.) |
| Researcher Affiliation | Academia | John A. Paulson School of Engineering and Applied Sciences, Harvard University, Boston, MA 02134. Emails: carol_long@g.harvard.edu, alghamdi@g.harvard.edu, flavio@seas.harvard.edu. |
| Pseudocode | No | The paper describes methods in paragraph text and does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code can be found at https://github.com/Carol-Long/Fairness_and_Arbitrariness |
| Open Datasets | Yes | We report predictive multiplicity and benchmark the ensemble method on three datasets: two in the education domain, the high-school longitudinal study (HSLS) dataset [27, 28] and the ENEM dataset [16] (see Alghamdi et al. [2], Appendix B.1), and the UCI Adult dataset [33], which is based on US census income data. |
| Dataset Splits | Yes | First, split the data into training, validation, and test sets. ... We use the validation set to measure the \epsilon corresponding to this empirical Rashomon set. (A sketch of one way to estimate this \epsilon follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions the Scikit-learn and AIF360 toolkits and the pandas package, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For logistic regression and gradient boosting, the default hyperparameters are used; for random forest, we set the number of trees and the minimum number of samples per leaf to 10 to prevent over-fitting. To get 10 competing models for each hypothesis class, we use 10 random seeds (specifically 33 to 42). (A sketch of this setup follows the table.) |
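
The ensemble idea referenced in the Research Type row can be illustrated with a short sketch. The snippet below is a minimal illustration, not the authors' released code (see the repository linked above for that): it builds a pool of competing logistic-regression models that differ only in random seed, uses per-individual score spread as a crude proxy for predictive multiplicity, and averages the pool's scores into a single ensemble score. The synthetic data, the bootstrap-by-seed trick (needed because logistic regression is otherwise deterministic), and the std-based multiplicity proxy are all assumptions for illustration.

```python
# Minimal sketch (not the authors' code): score-averaging ensemble over
# competing models, with per-individual score std as a multiplicity proxy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for HSLS / ENEM / UCI Adult.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Competing models: same hypothesis class, different seeds. Each seed
# draws a different 80% subsample so the fitted models actually differ.
scores = []
for seed in range(33, 43):  # seeds 33-42, matching the paper's setup
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_train))[: int(0.8 * len(X_train))]
    model = LogisticRegression(max_iter=1000).fit(X_train[idx], y_train[idx])
    scores.append(model.predict_proba(X_test)[:, 1])
scores = np.stack(scores)                # shape: (n_models, n_test)

per_individual_std = scores.std(axis=0)  # crude arbitrariness proxy
ensemble_score = scores.mean(axis=0)     # score-averaging ensemble
print("mean per-individual std:", per_individual_std.mean())
print("max  per-individual std:", per_individual_std.max())
# The ensemble emits one score per individual, so the seed-induced
# disagreement within this fixed pool is removed by construction.
```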
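The Dataset Splits row mentions using the validation set to measure the \epsilon of the empirical Rashomon set. A common formalization, assumed here since the excerpt does not spell it out, is the set of models whose loss is within \epsilon of the best model's loss; the sketch below estimates that \epsilon on a validation split as the largest loss gap between any model in a pre-trained pool and the best one.

```python
# Minimal sketch (assumed formalization): estimate the epsilon of an
# empirical Rashomon set from a pool of pre-trained competing models.
import numpy as np
from sklearn.metrics import log_loss

def empirical_rashomon_epsilon(models, X_val, y_val):
    """Largest validation-loss gap between any model and the best one."""
    losses = np.array([
        log_loss(y_val, m.predict_proba(X_val)[:, 1]) for m in models
    ])
    # Every model in the pool lies within this epsilon of the best loss.
    return losses.max() - losses.min()
```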
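The hyperparameters quoted in the Experiment Setup row translate directly into scikit-learn constructors. The sketch below is one plausible reading of that row, with the exact constructor arguments as assumptions: defaults for logistic regression and gradient boosting, 10 trees and a 10-sample leaf minimum for the random forest, and one model per seed from 33 to 42.

```python
# Minimal sketch (assumed constructors): the three hypothesis classes and
# seed sweep described in the Experiment Setup row.
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier

SEEDS = range(33, 43)  # 10 seeds: 33-42

def make_models(seed):
    return {
        "logistic_regression": LogisticRegression(random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
        # 10 trees and >=10 samples per leaf to prevent over-fitting.
        "random_forest": RandomForestClassifier(
            n_estimators=10, min_samples_leaf=10, random_state=seed
        ),
    }

# One pool of 10 competing (unfitted) models per hypothesis class;
# each should be fit on the training split before evaluation.
pools = {name: [] for name in make_models(0)}
for seed in SEEDS:
    for name, model in make_models(seed).items():
        pools[name].append(model)
```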