Bounding the fairness and accuracy of classifiers from population statistics

Authors: Sivan Sabato, Elad Yom-Tov

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose an efficient and practical procedure for finding the best possible lower bound on the discrepancy of the classifier, given the aggregate statistics, and demonstrate in experiments the empirical tightness of this lower bound, as well as its possible uses on various types of problems, ranging from estimating the quality of voting polls to measuring the effectiveness of patient identification from internet search queries.
Researcher Affiliation | Collaboration | Sivan Sabato (1, 2) and Elad Yom-Tov (2); (1) Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel; (2) Microsoft Research, Herzelia, Israel. Correspondence to: Sivan Sabato <sabatos@cs.bgu.ac.il>, Elad Yom-Tov <eladyt@microsoft.com>.
Pseudocode | Yes | Algorithm 1: Finding a lower bound for discrepancy_β. Input: Ins = ({w_g}_{g∈G}, {(π^y_g, p̂^y_g)}_{g∈G, y∈Y}), β ∈ [0, 1], tolerance γ > 0. Output: a value V ∈ [V*, V* + γ], where V* (see Eq. (8)) is the disc_β lower bound; the values of unfairness and error that obtain V.
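The tolerance guarantee in Algorithm 1's output (a value within γ of the true optimum V*) is the kind of guarantee a bisection-style search provides. The sketch below is only an illustration of that guarantee, not the paper's actual procedure: the `feasible` predicate and the search interval are hypothetical stand-ins, assuming a monotone feasibility structure.

```python
def lower_bound_bisect(feasible, lo, hi, gamma):
    """Generic bisection sketch (hypothetical, not the paper's Alg. 1).

    Assumes feasible(v) is True iff v <= V*, so the unknown optimum V*
    is the largest feasible value in [lo, hi]. Returns a value V with
    V* <= V <= V* + gamma, mirroring the tolerance guarantee stated
    for Algorithm 1's output.
    """
    while hi - lo > gamma:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            lo = mid  # mid is still achievable: the optimum lies above
        else:
            hi = mid  # mid overshoots the optimum: tighten from above
    # hi is infeasible-or-boundary and within gamma of lo <= V*,
    # so hi lies in [V*, V* + gamma].
    return hi
```

For example, with `feasible = lambda v: v <= 0.7` the routine returns a value in [0.7, 0.7 + γ].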
Open Source Code | Yes | The code and data are available at https://github.com/sivansabato/bfa.
Open Datasets | Yes | In the first experiment, we used the UC Census (1990) data set (Dua & Graff, 2019) to generate hundreds of classifiers...
Dataset Splits | Yes | We split this data into two halves at random, using one half as a training set to generate classifiers, and the other half as a test set to calculate the aggregate statistics of the classifier, as well as its true unfairness and error.
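The split protocol quoted above can be sketched in a few lines; this is an illustrative reconstruction (the paper's own Matlab code is not reproduced here), with the function name and seed chosen for the example.

```python
import random

def random_half_split(n_rows, seed=0):
    # Illustrative sketch of the described protocol: shuffle row
    # indices and cut them into two equal halves, one half to train
    # classifiers and the other to compute the aggregate statistics
    # along with the true unfairness and error.
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    half = n_rows // 2
    return idx[:half], idx[half:]
```

Fixing the seed makes the split reproducible across runs, which matters when the same halves must serve both classifier generation and statistics computation.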
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, or cloud computing instance types.
Software Dependencies | No | The paper mentions using a 'standard Matlab package' for linear regression and that 'Matlab code for Alg. 1' is available, but it does not specify exact version numbers for Matlab or any other software dependencies.
Experiment Setup | No | The paper describes the general setup for generating classifiers and analyzing results (e.g., using linear regression), but it does not provide specific hyperparameter values, optimization settings, or detailed training configurations for the models used in the experiments.