Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Optimizing Generalized Rate Metrics with Three Players
Authors: Harikrishna Narasimhan, Andrew Cotter, Maya Gupta
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on different fairness tasks confirm the efficacy of our approach. |
| Researcher Affiliation | Industry | Google Research 1600 Amphitheatre Pkwy, Mountain View, CA 94043 EMAIL |
| Pseudocode | Yes | Algorithm 1 Oracle-based Optimizer. Initialize: λ0; for t = 0 to T − 1 do... |
| Open Source Code | Yes | All implementations are in Tensorflow (footnote 2). See Appendix G for additional details. Footnote 2: https://github.com/google-research/google-research/tree/master/generalized_rates |
| Open Datasets | Yes | We use five datasets: (1) COMPAS, where the goal is to predict recidivism with gender as the protected attribute [44]; (2) Communities & Crime, where the goal is to predict if a community in the US has a crime rate above the 70th percentile [45], and we consider communities having a black population above the 50th percentile as protected [27]; (3) Law School, where the task is to predict whether a law school student will pass the bar exam, with race (black or other) as the protected attribute [46]; (4) Adult, where the task is to predict if a person's income exceeds 50K/year, with gender as the protected attribute [45]; (5) Wiki Toxicity, where the goal is to predict if a comment posted on a Wikipedia talk page contains non-toxic/acceptable content, with the comments containing the term "gay" considered as a protected group [47]. |
| Dataset Splits | No | The paper mentions using datasets but does not explicitly provide training, validation, and test split percentages or counts in the main text. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions 'All implementations are in Tensorflow' but does not specify a version number or other software dependencies with versions. |
| Experiment Setup | No | The paper mentions using linear models and hinge losses as surrogates, and specifies the objectives and constraints for the fairness tasks, but does not provide specific hyperparameters or detailed training configurations like learning rates, batch sizes, or optimizers. |