Unlocking Fairness: a Trade-off Revisited
Authors: Michael Wick, Swetasudha Panda, Jean-Baptiste Tristan
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate fairness and accuracy, but this time under a variety of controlled conditions in which we vary the amount and type of bias. We find, under reasonable assumptions, that the tension between fairness and accuracy is illusive, and vanishes as soon as we account for these phenomena during evaluation. Moreover, our results are consistent with an opposing conclusion: fairness and accuracy are sometimes in accord. This raises the question, might there be a way to harness fairness to improve accuracy after all? Since many notions of fairness are with respect to the model's predictions and not the ground truth labels, this provides an opportunity to see if we can improve accuracy by harnessing appropriate notions of fairness over large quantities of unlabeled data with techniques like posterior regularization and generalized expectation. We find that semi-supervision improves both accuracy and fairness while imparting beneficial properties of the unlabeled data on the classifier. (A hedged sketch of this semi-supervised idea follows the table.) |
| Researcher Affiliation | Industry | Michael Wick, Swetasudha Panda, Jean-Baptiste Tristan {michael.wick,swetasudha.panda,jean.baptiste.tristan}@oracle.com Oracle Labs, Burlington, MA. |
| Pseudocode | No | The paper describes methods with mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | Yes | For data of type (b) we begin with the COMPAS data, treat the two-year recidivism labels as the unbiased ground-truth z and then apply our model of label bias to produce the biased labels y ∼ g(z|y, ·, x, β) [15]. (A hedged sketch of such a label-bias injection follows the table.) |
| Dataset Splits | Yes | We always report the mean and standard error of these various metrics computed over ten experiments with ten randomly generated datasets (or in the case of COMPAS, ten random splits). |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions that the method is easy to implement in existing toolkits such as Scikit-Learn, PyTorch or TensorFlow, but it does not specify which software or libraries, along with their version numbers, were used for the experiments. |
| Experiment Setup | No | The paper describes general experimental conditions (varying bias, evaluating classifiers) but does not provide specific details on hyperparameters, model initialization, or training configurations. |
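The Research Type row quotes the paper's idea of harnessing notions of fairness defined over model predictions on unlabeled data (via posterior regularization or generalized expectation) to improve both accuracy and fairness. The sketch below is not the paper's method; it is a minimal PyTorch-style illustration of the general idea, assuming a hypothetical binary classifier `model`, labeled tensors `x_lab`/`y_lab`, unlabeled features `x_unlab` with protected attribute `a_unlab`, and a penalty weight `lam`.

```python
import torch
import torch.nn as nn

def semi_supervised_fairness_loss(model, x_lab, y_lab, x_unlab, a_unlab, lam=1.0):
    """Supervised cross-entropy on labeled data plus a demographic-parity-style
    penalty computed on unlabeled data: the mean predicted positive rate should
    match across the two protected groups. A simplified, generalized-expectation-
    flavored constraint, not the paper's exact formulation."""
    # Standard supervised term on the labeled batch.
    sup_loss = nn.functional.cross_entropy(model(x_lab), y_lab)

    # Fairness term on the unlabeled batch: compare predicted positive rates
    # between protected groups a=0 and a=1.
    probs_pos = torch.softmax(model(x_unlab), dim=1)[:, 1]
    rate_g0 = probs_pos[a_unlab == 0].mean()
    rate_g1 = probs_pos[a_unlab == 1].mean()
    fairness_penalty = (rate_g0 - rate_g1).abs()

    return sup_loss + lam * fairness_penalty
```

In a training loop, this combined loss would simply replace the plain supervised loss; the unlabeled batch contributes only through the fairness term.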
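The Open Datasets row quotes the paper's procedure of treating the COMPAS two-year recidivism labels as unbiased ground truth z and then applying a label-bias model to produce biased labels y. Below is a minimal sketch of one way such an injection could look; the column names, the disadvantaged-group value, the flip direction, and the rate `beta` are illustrative assumptions, not the paper's bias model g.

```python
import numpy as np
import pandas as pd

def inject_label_bias(df: pd.DataFrame, label_col="two_year_recid",
                      group_col="race", disadvantaged="African-American",
                      beta=0.2, seed=0):
    """Hypothetical label-bias injection: treat the observed labels as the
    unbiased ground truth z, then flip some of them for one group with
    probability beta to produce biased labels y. This stands in for, and is
    much simpler than, the paper's bias model g."""
    rng = np.random.default_rng(seed)
    z = df[label_col].to_numpy()
    y = z.copy()
    # Flip z=0 -> y=1 for the disadvantaged group with probability beta,
    # simulating a labeler who over-predicts recidivism for that group.
    candidates = np.flatnonzero((df[group_col] == disadvantaged).to_numpy() & (z == 0))
    flipped = candidates[rng.random(candidates.size) < beta]
    y[flipped] = 1
    return z, y
```

A classifier trained on the returned y and evaluated against z would then exhibit the gap between biased-label accuracy and true-label accuracy that the paper studies.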
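The Dataset Splits row notes that metrics are reported as the mean and standard error over ten experiments with random splits. A minimal sketch of that reporting protocol, assuming a scikit-learn-style classifier and plain accuracy as the example metric:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def mean_and_stderr_over_splits(X, y, n_splits=10, test_size=0.3):
    """Run one experiment per random split and report the metric's mean and
    standard error across splits, mirroring the paper's reporting protocol."""
    scores = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        scores.append(clf.score(X_te, y_te))  # accuracy as the example metric
    scores = np.asarray(scores)
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n_splits)
```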