Bounding the fairness and accuracy of classifiers from population statistics
Authors: Sivan Sabato, Elad Yom-Tov
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose an efficient and practical procedure for finding the best possible lower bound on the discrepancy of the classifier, given the aggregate statistics, and demonstrate in experiments the empirical tightness of this lower bound, as well as its possible uses on various types of problems, ranging from estimating the quality of voting polls to measuring the effectiveness of patient identification from internet search queries. |
| Researcher Affiliation | Collaboration | Sivan Sabato 1 2 and Elad Yom-Tov 2; 1Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel; 2Microsoft Research, Herzelia, Israel. Correspondence to: Sivan Sabato <sabatos@cs.bgu.ac.il>, Elad Yom-Tov <eladyt@microsoft.com>. |
| Pseudocode | Yes | Algorithm 1: Finding a lower bound for discrepancy_β. Input: Ins({w_g}_{g∈G}, {(π^y_g, p̂^y_g)}_{g∈G, y∈Y}), β ∈ [0, 1], tolerance γ > 0. Output: a value V ∈ [V*, V* + γ], where V* (see Eq. (8)) is the disc_β lower bound, along with the values of unfairness and error that obtain V. |
| Open Source Code | Yes | The code and data are available at https://github.com/sivansabato/bfa. |
| Open Datasets | Yes | In the first experiment, we used the UC Census (1990) data set (Dua & Graff, 2019) to generate hundreds of classifiers... |
| Dataset Splits | Yes | We split this data into two halves at random, using one half as a training set to generate classifiers, and the other half as a test set to calculate the aggregate statistics of the classifier, as well as its true unfairness and error. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions using a 'standard Matlab package' for linear regression and that 'Matlab code for Alg. 1' is available, but it does not specify exact version numbers for Matlab or any other software dependencies. |
| Experiment Setup | No | The paper describes the general setup for generating classifiers and analyzing results (e.g., using linear regression), but it does not provide specific hyperparameter values, optimization settings, or detailed training configurations for the models used in the experiments. |
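The Dataset Splits row describes a random half-split, with one half used to train classifiers and the other to compute the aggregate statistics (per-group weights and prediction rates) that the method takes as input. A minimal sketch of that procedure is shown below; this is an illustration only, not the authors' Matlab code, and the function and variable names (`split_halves`, `aggregate_stats`) are assumed:

```python
import random
from collections import defaultdict

def split_halves(rows, seed=0):
    """Randomly split rows into two equal halves: (train, test)."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    mid = len(rows) // 2
    train = [rows[i] for i in idx[:mid]]
    test = [rows[i] for i in idx[mid:]]
    return train, test

def aggregate_stats(groups, predictions):
    """Compute per-group aggregate statistics on the test half:
    w_g, the fraction of examples in group g, and
    p_g, the fraction of group g predicted positive."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for g, yhat in zip(groups, predictions):
        counts[g] += 1
        positives[g] += int(yhat == 1)
    n = len(groups)
    weights = {g: counts[g] / n for g in counts}
    pos_rates = {g: positives[g] / counts[g] for g in counts}
    return weights, pos_rates

# Example: 6 test examples in two demographic groups.
groups = ["a", "a", "a", "b", "b", "b"]
predictions = [1, 1, 0, 0, 0, 1]
w, p = aggregate_stats(groups, predictions)
```

Statistics of this form (group weights plus per-group label and prediction rates) are the only inputs the lower-bound procedure needs, which is what makes it applicable when individual-level predictions are unavailable.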