FARE: Provably Fair Representation Learning with Practical Certificates
Authors: Nikola Jovanović, Mislav Balunović, Dimitar Iliev Dimitrov, Martin Vechev
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our comprehensive experimental evaluation, we demonstrate that FARE produces practical certificates that are tight and often even comparable with purely empirical results obtained by prior methods, which establishes the practical value of our approach. |
| Researcher Affiliation | Academia | Nikola Jovanović¹, Mislav Balunović¹, Dimitar I. Dimitrov¹, Martin Vechev¹ ... ¹Department of Computer Science, ETH Zurich. |
| Pseudocode | No | The paper describes procedures and derivations in prose and mathematical notation but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The implementation of FARE is publicly available at https://github.com/eth-sri/fare. |
| Open Datasets | Yes | We consider common fairness datasets: Health (Kaggle, 2012), ACSIncome-CA (only California), and ACSIncome-US (US-wide) (Ding et al., 2021). |
| Dataset Splits | Yes | a set D of datapoints {(x^(j), s^(j))} from X is split into a training set D_train, used to train f, validation set D_val, held out for the upper-bounding procedure (and not used in training of f in any capacity), and a test set D_test, used to evaluate the empirical accuracy and fairness of downstream classifiers. (A minimal split sketch follows the table.) |
| Hardware Specification | Yes | We use a single core of an Intel i9-7900X CPU with a clock speed of 3.30 GHz. All methods were given a single NVIDIA 1080 Ti GPU with 12 GB of VRAM, except FARE, which does not require a GPU. |
| Software Dependencies | No | The paper mentions hardware and operating systems but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For FARE, there are four hyperparameters: γ (used for the criterion, where larger γ puts more focus on fairness), k (upper bound for the number of leaves), n_i (lower bound for the number of examples in a leaf), and v (the ratio of the training set to be used as a validation set). ... In our experiments we investigate γ ∈ [0, 1], k ∈ [2, 200], n_i ∈ [50, 1000], v ∈ {0.1, 0.2, 0.3, 0.5}. For the upper-bounding procedure, we always set ϵ = 0.05, ϵ_b = ϵ_s = 0.005, and thus ϵ_c = 0.04. Finally, when sorting categorical features as described in Section 6, we use q ∈ {1, 2, 4} in all cases. (A hypothetical search-grid sketch also follows the table.) |
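
The Dataset Splits row describes a three-way split into D_train (used to train the encoder f), D_val (held out for the upper-bounding procedure), and D_test (downstream evaluation). Below is a minimal sketch of such a split, not the authors' code: the function name, the stratification by the sensitive attribute, and the 20%/20% validation and test fractions are illustrative assumptions, since the paper's exact split sizes are not quoted here.

```python
from sklearn.model_selection import train_test_split

def three_way_split(X, s, test_frac=0.2, val_frac=0.2, seed=0):
    """Sketch: split features X and sensitive attributes s into D_train / D_val / D_test."""
    # Carve out the test set first, then split the remainder into train and validation.
    X_rest, X_test, s_rest, s_test = train_test_split(
        X, s, test_size=test_frac, random_state=seed, stratify=s)
    rel_val = val_frac / (1.0 - test_frac)  # validation fraction relative to the remainder
    X_train, X_val, s_train, s_val = train_test_split(
        X_rest, s_rest, test_size=rel_val, random_state=seed, stratify=s_rest)
    return (X_train, s_train), (X_val, s_val), (X_test, s_test)
```

Splitting off the test set first keeps it untouched by any later choice of validation ratio, which matches the paper's requirement that D_val is never used to train f.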
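
The Experiment Setup row lists the FARE hyperparameter ranges and the fixed constants of the upper-bounding procedure. The sketch below shows one hypothetical way to enumerate that search space: only the ranges, the sets for v and q, and the ϵ constants come from the quoted text, while the concrete grid points chosen inside each range (e.g., for γ, k, and n_i) are assumptions for illustration.

```python
import itertools

# Ranges and fixed constants quoted from the paper; the grid points inside each
# range are illustrative assumptions, not the authors' actual grid.
search_space = {
    "gamma": [0.0, 0.25, 0.5, 0.75, 1.0],  # criterion weight, gamma in [0, 1]
    "k":     [2, 10, 50, 100, 200],        # upper bound on the number of leaves, k in [2, 200]
    "n_i":   [50, 200, 500, 1000],         # lower bound on examples per leaf, n_i in [50, 1000]
    "v":     [0.1, 0.2, 0.3, 0.5],         # ratio of the training set used as validation
    "q":     [1, 2, 4],                    # used when sorting categorical features (Section 6)
}

# Constants of the upper-bounding procedure, fixed in all experiments.
EPS, EPS_B, EPS_S = 0.05, 0.005, 0.005
EPS_C = EPS - EPS_B - EPS_S  # = 0.04

configs = [dict(zip(search_space, values))
           for values in itertools.product(*search_space.values())]
print(f"{len(configs)} candidate FARE configurations")
```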