Auditing Fairness by Betting

Authors: Ben Chugg, Santiago Cortes-Gomez, Bryan Wilder, Aaditya Ramdas

NeurIPS 2023

Each entry below pairs a reproducibility variable and its result with the LLM response that supports it.
Research Type: Experimental
LLM Response: "We demonstrate the efficacy of our approach on three benchmark fairness datasets."

Researcher Affiliation: Academia
LLM Response: "Departments of Machine Learning and Statistics, Carnegie Mellon University. {benchugg, scortesg, bwilder, aramdas}@cmu.edu"

Pseudocode: Yes
LLM Response: "Algorithm 1: Testing group fairness by betting"

Open Source Code: Yes
LLM Response: "All code is publicly available at https://github.com/bchugg/auditing-fairness."

Open Datasets: Yes
LLM Response: "We demonstrate the efficacy of our approach on three benchmark fairness datasets": credit default data, US census data, and insurance data. A vanilla random forest was used for both the credit default dataset and the US census data, with the census data accessed by means of the folktables package [51]. The US health insurance data [52] are synthetic, generated from US census information.

Dataset Splits: No
LLM Response: The paper discusses fixed-time tests that use "a batch of n iid observations," but it does not specify train/validation/test splits (percentages or counts) for its own experiments in the main text. The focus is on sequential testing, where data arrive continuously.

Hardware Specification: No
LLM Response: The paper does not report hardware details such as GPU models, CPU models, or memory specifications used to run the experiments.

Software Dependencies: No
LLM Response: The paper mentions "vanilla random forest" and "logistic regression" models and "the folktables package [51]," but it does not give software names with version numbers for libraries, frameworks, or languages (e.g., Python, scikit-learn, TensorFlow/PyTorch).

Experiment Setup: No
LLM Response: The paper states that a vanilla random forest and a logistic regression model were used, and briefly describes how models are made fair or unfair by manipulating means. However, it does not provide specific setup details such as hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or other system-level training configurations.
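
To make the pseudocode row concrete, here is a toy, hypothetical sketch of the betting idea behind "Testing group fairness by betting", not the paper's Algorithm 1: under the fairness null, paired prediction differences between two groups have mean zero, so a wealth process that bets on those differences is a nonnegative martingale, and by Ville's inequality it exceeds 1/alpha with probability at most alpha. The function name, the clipped running-mean betting rule (the paper uses an online Newton step), and the simulated streams are all illustrative assumptions.

```python
import numpy as np

def betting_audit(diffs, alpha=0.05):
    """Toy sequential fairness audit by betting (illustrative sketch only).

    `diffs` is a stream of paired prediction differences between two groups,
    assumed to lie in [-1, 1]. Under the null (group fairness) the
    differences have mean zero, so the wealth process below is a
    nonnegative martingale starting at 1; by Ville's inequality,
    P(sup_t wealth_t >= 1/alpha) <= alpha under the null.
    """
    wealth = 1.0
    running_mean = 0.0  # predictable estimate of E[d], used to set the bet
    for t, d in enumerate(diffs, start=1):
        # Simple clipped-mean bet; the paper's algorithm uses online Newton step.
        lam = float(np.clip(running_mean, -0.5, 0.5))
        wealth *= 1.0 + lam * d            # place the bet (stays > 0 since |lam*d| <= 0.5)
        running_mean += (d - running_mean) / t  # update AFTER betting (predictability)
        if wealth >= 1.0 / alpha:
            return t  # reject the null: evidence the model treats the groups differently
    return None  # wealth never crossed 1/alpha: no rejection

rng = np.random.default_rng(0)
# "Unfair" model: groupwise mean predictions differ by 0.3
unfair_diffs = np.clip(rng.normal(0.3, 0.2, size=2000), -1.0, 1.0)
# "Fair" model: mean-zero differences
fair_diffs = np.clip(rng.normal(0.0, 0.2, size=2000), -1.0, 1.0)

print(betting_audit(unfair_diffs))  # rejects after a modest number of samples
print(betting_audit(fair_diffs))    # typically None (no rejection)
```

The sequential character is what distinguishes this from a fixed-time test: the audit can stop and reject the moment the wealth crosses 1/alpha, while continuing to observe data costs nothing in validity.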