Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Scalable and Stable Surrogates for Flexible Classifiers with Fairness Constraints
Authors: Henry C Bendekgey, Erik Sudderth
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our surrogates perform comparably to the state-of-the-art on low-dimensional fairness benchmarks, while achieving superior accuracy and stability for more complex computer vision and natural language processing tasks. We compare relaxations by training fairness-regularized logistic regression on tabular data, and fairness-regularized deep neural networks on image and text data. |
| Researcher Affiliation | Academia | Harry Bendekgey EMAIL Erik B. Sudderth EMAIL Department of Computer Science, University of California Irvine School of Information and Computer Science, Irvine, CA, USA |
| Pseudocode | No | The paper describes mathematical formulations and algorithmic steps in prose and equations but does not contain a formally labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Adult. The Adult data set [36] is one of the most popular in the fair classification literature..., COMPAS. The COMPAS data set was compiled by Angwin et al. [2]..., CelebA. The CelebFaces Attributes data set [39]..., Faces of the World (FOTW) data set [20]..., Yelp Text Data. We use a subset of the Yelp review data set [13]... |
| Dataset Splits | No | The paper mentions "training data" and "test data" in figures and text but does not explicitly provide specific train/validation/test split percentages, sample counts, or detailed methodology for creating these splits in the main body. It states "Methods for hyperparameter selection, data pre-processing, and optimization are detailed in the supplement", which might contain this information, but it's not present in the main text. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'WRN-50-2' and 'Google’s transformer-based BERT' but does not specify their version numbers or the versions of other ancillary software (e.g., deep learning frameworks, programming languages) used in the experiments. |
| Experiment Setup | No | The paper explicitly states: 'Methods for hyperparameter selection, data pre-processing, and optimization are detailed in the supplement.' This indicates that specific experimental setup details are not provided in the main text. |