Bounding the fairness and accuracy of classifiers from population statistics

Authors: Sivan Sabato, Elad Yom-Tov

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an efficient and practical procedure for finding the best possible lower bound on the discrepancy of the classifier, given the aggregate statistics, and demonstrate in experiments the empirical tightness of this lower bound, as well as its possible uses on various types of problems, ranging from estimating the quality of voting polls to measuring the effectiveness of patient identification from internet search queries.
Researcher Affiliation | Collaboration | Sivan Sabato (1, 2) and Elad Yom-Tov (2); (1) Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel; (2) Microsoft Research, Herzelia, Israel. Correspondence to: Sivan Sabato <sabatos@cs.bgu.ac.il>, Elad Yom-Tov <eladyt@microsoft.com>.
Pseudocode | Yes | Algorithm 1: Finding a lower bound for discrepancy_β. Input: Ins = ({w_g}_{g∈G}, {(π^y_g, p̂^y_g)}_{g∈G, y∈Y}), β ∈ [0, 1], tolerance γ > 0. Output: a value V ∈ [V*, V* + γ], where V* (see Eq. (8)) is the disc_β lower bound; the values of unfairness and error that obtain V.
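The tolerance guarantee in Algorithm 1's output (a value within γ of the true optimum V*) is the kind of guarantee a bisection-style search provides. The sketch below is only an illustration of that guarantee, not the paper's actual procedure: the `feasible` predicate and the search interval are hypothetical stand-ins, assuming a monotone feasibility structure.

```python
def lower_bound_bisect(feasible, lo, hi, gamma):
    """Generic bisection sketch (hypothetical, not the paper's Alg. 1).

    Assumes feasible(v) is True iff v <= V*, so the unknown optimum V*
    is the largest feasible value in [lo, hi]. Returns a value V with
    V* <= V <= V* + gamma, mirroring the tolerance guarantee stated
    for Algorithm 1's output.
    """
    while hi - lo > gamma:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            lo = mid  # mid is still achievable: the optimum lies above
        else:
            hi = mid  # mid overshoots the optimum: tighten from above
    # hi is infeasible-or-boundary and within gamma of lo <= V*,
    # so hi lies in [V*, V* + gamma].
    return hi
```

For example, with `feasible = lambda v: v <= 0.7` the routine returns a value in [0.7, 0.7 + γ].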
Open Source Code | Yes | The code and data are available at https://github.com/sivansabato/bfa.
Open Datasets | Yes | In the first experiment, we used the UC Census (1990) data set (Dua & Graff, 2019) to generate hundreds of classifiers...
Dataset Splits | Yes | We split this data into two halves at random, using one half as a training set to generate classifiers, and the other half as a test set to calculate the aggregate statistics of the classifier, as well as its true unfairness and error.
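The split protocol quoted above can be sketched in a few lines; this is an illustrative reconstruction (the paper's own Matlab code is not reproduced here), with the function name and seed chosen for the example.

```python
import random

def random_half_split(n_rows, seed=0):
    # Illustrative sketch of the described protocol: shuffle row
    # indices and cut them into two equal halves, one half to train
    # classifiers and the other to compute the aggregate statistics
    # along with the true unfairness and error.
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    half = n_rows // 2
    return idx[:half], idx[half:]
```

Fixing the seed makes the split reproducible across runs, which matters when the same halves must serve both classifier generation and statistics computation.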
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, or cloud computing instance types.
Software Dependencies | No | The paper mentions using a 'standard Matlab package' for linear regression and that 'Matlab code for Alg. 1' is available, but it does not specify exact version numbers for Matlab or any other software dependencies.
Experiment Setup | No | The paper describes the general setup for generating classifiers and analyzing results (e.g., using linear regression), but it does not provide specific hyperparameter values, optimization settings, or detailed training configurations for the models used in the experiments.