Unlocking Fairness: a Trade-off Revisited
Authors: Michael Wick, Swetasudha Panda, Jean-Baptiste Tristan
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate fairness and accuracy, but this time under a variety of controlled conditions in which we vary the amount and type of bias. We find, under reasonable assumptions, that the tension between fairness and accuracy is illusive, and vanishes as soon as we account for these phenomena during evaluation. Moreover, our results are consistent with an opposing conclusion: fairness and accuracy are sometimes in accord. This raises the question, might there be a way to harness fairness to improve accuracy after all? Since many notions of fairness are with respect to the model's predictions and not the ground truth labels, this provides an opportunity to see if we can improve accuracy by harnessing appropriate notions of fairness over large quantities of unlabeled data with techniques like posterior regularization and generalized expectation. We find that semi-supervision improves both accuracy and fairness while imparting beneficial properties of the unlabeled data on the classifier. (A hedged sketch of this semi-supervised idea follows the table.) |
| Researcher Affiliation | Industry | Michael Wick, Swetasudha Panda, Jean-Baptiste Tristan {michael.wick,swetasudha.panda,jean.baptiste.tristan}@oracle.com Oracle Labs, Burlington, MA. |
| Pseudocode | No | The paper describes methods with mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | Yes | For data of type (b) we begin with the COMPAS data, treat the two-year recidivism labels as the unbiased ground-truth z and then apply our model of label bias to produce the biased labels y ∼ g(z|y, ·, x, β) [15]. (A hedged sketch of such a label-bias injection follows the table.) |
| Dataset Splits | Yes | We always report the mean and standard error of these various metrics computed over ten experiments with ten randomly generated datasets (or in the case of COMPAS, ten random splits). |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions that the method is easy to implement in existing toolkits such as Scikit-Learn, PyTorch or TensorFlow, but it does not specify which software or libraries, along with their version numbers, were used for the experiments. |
| Experiment Setup | No | The paper describes general experimental conditions (varying bias, evaluating classifiers) but does not provide specific details on hyperparameters, model initialization, or training configurations. |
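The Research Type row quotes the paper's idea of harnessing notions of fairness defined over model predictions on unlabeled data (via posterior regularization or generalized expectation) to improve both accuracy and fairness. The sketch below is not the paper's method; it is a minimal PyTorch-style illustration of the general idea, assuming a hypothetical binary classifier `model`, labeled tensors `x_lab`/`y_lab`, unlabeled features `x_unlab` with protected attribute `a_unlab`, and a penalty weight `lam`.

```python
import torch
import torch.nn as nn

def semi_supervised_fairness_loss(model, x_lab, y_lab, x_unlab, a_unlab, lam=1.0):
    """Supervised cross-entropy on labeled data plus a demographic-parity-style
    penalty computed on unlabeled data: the mean predicted positive rate should
    match across the two protected groups. A simplified, generalized-expectation-
    flavored constraint, not the paper's exact formulation."""
    # Standard supervised term on the labeled batch.
    sup_loss = nn.functional.cross_entropy(model(x_lab), y_lab)

    # Fairness term on the unlabeled batch: compare predicted positive rates
    # between protected groups a=0 and a=1.
    probs_pos = torch.softmax(model(x_unlab), dim=1)[:, 1]
    rate_g0 = probs_pos[a_unlab == 0].mean()
    rate_g1 = probs_pos[a_unlab == 1].mean()
    fairness_penalty = (rate_g0 - rate_g1).abs()

    return sup_loss + lam * fairness_penalty
```

In a training loop, this combined loss would simply replace the plain supervised loss; the unlabeled batch contributes only through the fairness term.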
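The Open Datasets row quotes the paper's procedure of treating the COMPAS two-year recidivism labels as unbiased ground truth z and then applying a label-bias model to produce biased labels y. Below is a minimal sketch of one way such an injection could look; the column names, the disadvantaged-group value, the flip direction, and the rate `beta` are illustrative assumptions, not the paper's bias model g.

```python
import numpy as np
import pandas as pd

def inject_label_bias(df: pd.DataFrame, label_col="two_year_recid",
                      group_col="race", disadvantaged="African-American",
                      beta=0.2, seed=0):
    """Hypothetical label-bias injection: treat the observed labels as the
    unbiased ground truth z, then flip some of them for one group with
    probability beta to produce biased labels y. This stands in for, and is
    much simpler than, the paper's bias model g."""
    rng = np.random.default_rng(seed)
    z = df[label_col].to_numpy()
    y = z.copy()
    # Flip z=0 -> y=1 for the disadvantaged group with probability beta,
    # simulating a labeler who over-predicts recidivism for that group.
    candidates = np.flatnonzero((df[group_col] == disadvantaged).to_numpy() & (z == 0))
    flipped = candidates[rng.random(candidates.size) < beta]
    y[flipped] = 1
    return z, y
```

A classifier trained on the returned y and evaluated against z would then exhibit the gap between biased-label accuracy and true-label accuracy that the paper studies.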
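The Dataset Splits row notes that metrics are reported as the mean and standard error over ten experiments with random splits. A minimal sketch of that reporting protocol, assuming a scikit-learn-style classifier and plain accuracy as the example metric:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def mean_and_stderr_over_splits(X, y, n_splits=10, test_size=0.3):
    """Run one experiment per random split and report the metric's mean and
    standard error across splits, mirroring the paper's reporting protocol."""
    scores = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        scores.append(clf.score(X_te, y_te))  # accuracy as the example metric
    scores = np.asarray(scores)
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n_splits)
```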