Standardized Interpretable Fairness Measures for Continuous Risk Scores

Authors: Ann-Kristin Becker, Oana Dumitrasc, Klaus Broelemann

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 contains results of experiments using benchmark data, and Section 6 includes the final discussion and outlook.
Researcher Affiliation | Industry | SCHUFA Holding AG, Wiesbaden, Germany. Correspondence to: Ann-Kristin Becker <ann-kristin.becker@schufa.de>, Oana Dumitrasc <oana.dumitrasc@schufa.de>, Klaus Broelemann <klaus.broelemann@schufa.de>.
Pseudocode | No | The paper contains mathematical definitions, theorems, and proofs but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | The code used for the experiments is available online at https://github.com/schufa-innovationlab/fair-scoring. The repository includes detailed instructions for reproducing the results.
Open Datasets | Yes | The paper uses the COMPAS dataset (https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv), the Adult dataset (https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data), and the German Credit dataset (https://www.kaggle.com/datasets/uciml/germancredit?resource=download) to demonstrate the application of the fairness measures for continuous risk scores.
Dataset Splits | No | The paper states that the models "have been trained on 70% of the dataset and evaluated on the remaining samples." This describes a train/test split but does not mention a separate validation set.
Hardware Specification | No | The paper does not specify the hardware used for the experiments, such as GPU or CPU models or cloud-computing instance types.
Software Dependencies | No | Appendix C.1 mentions software components such as 'scipy.wasserstein_distance', 'sklearn.calibration.calibration_curve', and 'sklearn.metrics.roc_curve', but does not specify their version numbers.
Experiment Setup | No | The paper describes general aspects of the setup, such as training logistic regression and XGBoost models and preprocessing the data (min-max scaling, one-hot encoding), but it does not specify concrete hyperparameters such as learning rates, batch sizes, or optimizer settings.
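Since the evaluation above notes that the paper's experiments rely on `scipy`'s Wasserstein distance applied to continuous risk scores, a minimal sketch of that kind of computation may be useful. This is an illustration only, not the authors' implementation: the synthetic Beta-distributed scores, the group labels, and the normalization by the score range are all assumptions made here to show how a standardized (range-normalized) distributional disparity between two groups' scores could be computed.

```python
# Illustrative sketch (assumed setup, not the paper's code): measure the
# disparity between the score distributions of two groups using the
# Wasserstein (earth mover's) distance, normalized by the score range so
# the result lies in [0, 1].
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

# Hypothetical continuous risk scores in [0, 1] for two demographic groups.
scores_a = rng.beta(2.0, 5.0, size=1000)  # group A
scores_b = rng.beta(2.5, 5.0, size=1000)  # group B, slightly shifted

# Raw Wasserstein distance between the two empirical score distributions.
raw = wasserstein_distance(scores_a, scores_b)

# Standardize by the total score range: 0 means identical distributions,
# 1 means the distributions are maximally separated within [0, 1].
score_range = 1.0  # scores live in [0, 1]
standardized = raw / score_range

print(f"standardized Wasserstein disparity: {standardized:.4f}")
```

A larger value indicates a greater distributional difference between the groups' scores; 0 would mean the two empirical distributions coincide.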