Fair Classification with Noisy Protected Attributes: A Framework with Provable Guarantees

Authors: L. Elisa Celis, Lingxiao Huang, Vijay Keswani, Nisheeth K. Vishnoi

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we show that our framework can be used to attain either statistical rate or false positive rate fairness guarantees with a minimal loss in accuracy, even when the noise is large, in two real-world datasets. We implement our denoised program, for binary and nonbinary protected attributes, and compare the performance with baseline algorithms on real-world datasets.
Researcher Affiliation | Academia | (1) Department of Statistics and Data Science, Yale University, USA; (2) Tsinghua University, China; (3) Department of Computer Science, Yale University, USA.
Pseudocode | No | The paper describes algorithms and programs (e.g., Program Target Fair, Program DFair) but does not provide them in a structured pseudocode or algorithm block format.
Open Source Code | Yes | Code available at github.com/vijaykeswani/Noisy-Fair-Classification.
Open Datasets | Yes | We perform simulations on the Adult (Asuncion & Newman, 2007) and COMPAS (Angwin et al., 2016b) benchmark datasets, as pre-processed in AIF360 toolkit (Bellamy et al., 2018b).
Dataset Splits | No | We first shuffle and partition the dataset into a train and test partition (70-30 split). The paper does not explicitly mention a validation set split.
Hardware Specification | No | The paper describes experimental simulations and comparisons but does not provide any specific hardware details used for running the experiments.
Software Dependencies | No | The paper mentions the 'AIF360 toolkit' and the 'SLSQP' solver, but does not provide specific version numbers for any software dependencies. (An illustrative solver call is sketched after this table.)
Experiment Setup | Yes | We first shuffle and partition the dataset into a train and test partition (70-30 split). For binary protected attributes, we use η0 = 0.3 and η1 = 0.1. For non-binary protected attributes, we use the noise matrix [...] For COMPAS, we use λ = 0.1 as a large fraction (47%) of training samples have class label 1, while for Adult, we use λ = 0 as the fraction of positive class labels is small (24%). (A setup sketch follows after this table.)
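
The Experiment Setup row quotes a 70-30 train/test split and group-dependent noise rates η0 = 0.3 and η1 = 0.1 for binary protected attributes. Below is a minimal sketch, not the authors' released code, of that setup: it assumes AIF360 is installed with its Adult data files in place, and it treats 'sex' as the binary protected attribute purely for illustration.

```python
# A minimal sketch (not the authors' released code) of the reported setup:
# load the AIF360-preprocessed Adult data, make a 70-30 train/test split,
# and flip the binary protected attribute with the group-dependent noise
# rates eta0 = 0.3 and eta1 = 0.1 quoted in the Experiment Setup row.
import numpy as np
from aif360.datasets import AdultDataset  # assumes AIF360 and its raw data files are installed

rng = np.random.default_rng(0)

dataset = AdultDataset()                            # Adult benchmark, AIF360 preprocessing
train, test = dataset.split([0.7], shuffle=True)    # 70-30 split, as reported

# 'sex' is assumed here to be the binary protected attribute under study.
idx = train.protected_attribute_names.index('sex')
z = train.protected_attributes[:, idx].astype(int)

# Flip each group's attribute value with its own noise rate.
eta0, eta1 = 0.3, 0.1
flip_prob = np.where(z == 0, eta0, eta1)
flip = rng.random(len(z)) < flip_prob
train.protected_attributes[:, idx] = np.where(flip, 1 - z, z)
# (In AIF360 the protected attribute may also appear among train.features;
#  a full pipeline would update that copy as well.)
```

The non-binary noise matrix and the λ values quoted in the row are parameters of the authors' program and are not reproduced in this sketch.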
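
The Software Dependencies row notes that the paper relies on the SLSQP solver without pinning a version. Purely to illustrate that kind of solver call, and not the paper's Program DFair itself, the following sketch fits a toy logistic model on synthetic data subject to a statistical-rate-style constraint using SciPy's SLSQP method; the data, constraint form, and threshold τ = 0.8 are all assumptions made for the example.

```python
# Illustrative only: a toy fairness-constrained optimization solved with SLSQP.
# The synthetic data and constraint form are assumptions, not the paper's program.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # toy features
z = rng.integers(0, 2, size=200)              # toy binary protected attribute
y = (X[:, 0] + 0.5 * z + rng.normal(scale=0.5, size=200) > 0).astype(float)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))

def log_loss(w):
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def statistical_rate(w, tau=0.8):
    # Soft acceptance rate per group; SLSQP 'ineq' constraints require fun(w) >= 0.
    p = sigmoid(X @ w)
    r0, r1 = p[z == 0].mean(), p[z == 1].mean()
    return min(r0, r1) / max(r0, r1) - tau

res = minimize(log_loss, x0=np.zeros(X.shape[1]), method='SLSQP',
               constraints=[{'type': 'ineq', 'fun': statistical_rate}])
print(res.success, res.x)
```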