Achievable Fairness on Your Data With Utility Guarantees

Authors: Muhammad Faaiz Taufiq, Jean-François Ton, Yang Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments spanning tabular (e.g., Adult), image (CelebA), and language (Jigsaw) datasets underscore that our approach not only reliably quantifies the optimum achievable trade-offs across various data modalities but also helps detect suboptimality in SOTA fairness methods."
Researcher Affiliation | Collaboration | Muhammad Faaiz Taufiq (ByteDance Research, faaiz.taufiq@bytedance.com); Jean-François Ton (ByteDance Research, jeanfrancois@bytedance.com); Yang Liu (University of California, Santa Cruz, yangliu@ucsc.edu)
Pseudocode | Yes | "Algorithm 1: Bootstrapping for estimating ϵ(h) := Φ̂_fair(h) - Φ_fair(h)" (a minimal sketch of the bootstrap idea follows the table)
Open Source Code | Yes | "The code to reproduce our experiments is provided at github.com/faaizT/DatasetFairness."
Open Datasets | Yes | "These datasets range from tabular (Adult and COMPAS), to image-based (CelebA), and natural language processing datasets (Jigsaw)." [7], [5], [28], [20]
Dataset Splits | Yes | "Specifically, we assume access to a held-out calibration dataset D_cal := {(X_i, A_i, Y_i)}_i which is disjoint from the training data." ... "obtained using a 10% data split as calibration dataset D_cal." ... "with early stopping based on validation losses." (a sketch of such a split follows the table)
Hardware Specification | Yes | "Training these simple models takes roughly 5 minutes on a Tesla-V100-SXM2-32GB GPU." ... "Training this model takes roughly 1.5 hours on a Tesla-V100-SXM2-32GB GPU." ... "Training this model takes roughly 6 hours on a Tesla-V100-SXM2-32GB GPU."
Software Dependencies | No | The paper mentions software components such as the 'BERT architecture [13]' and the 'Feature-wise Linear Modulation (FiLM) mechanism', but it does not specify version numbers for these components or for the underlying languages and libraries (e.g., Python, PyTorch).
Experiment Setup | Yes | "We train the model for a maximum of 1000 epochs, with early stopping based on validation losses." ... "we sample the parameter λ from a distribution P_λ." ... "we use the log-uniform distribution as per [15] as the sampling distribution P_λ, where the uniform distribution is U[10⁻⁶, 10]." ... "we follow in the footsteps of [15] to use Feature-wise Linear Modulation (FiLM) [34] layers." (sketches of the log-uniform sampling and a FiLM layer follow the table)
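
The paper's Algorithm 1 is not reproduced in this report, but the bootstrap idea it names is standard: resample the calibration set with replacement and use the spread of the resampled fairness estimates around the plug-in estimate Φ̂_fair(h) as a proxy for the estimation error ϵ(h). Below is a minimal sketch of that idea; the demographic-parity gap standing in for Φ_fair, and all function names, are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def dp_gap(preds, groups):
    # Demographic parity gap: |P(h(X)=1 | A=0) - P(h(X)=1 | A=1)|.
    return abs(preds[groups == 0].mean() - preds[groups == 1].mean())

def bootstrap_error_quantile(preds, groups, alpha=0.05, n_boot=1000, seed=0):
    # Resample the calibration set with replacement; the deviation of each
    # resampled estimate from the plug-in estimate acts as a proxy for
    # eps(h) := Phi_hat_fair(h) - Phi_fair(h).
    rng = np.random.default_rng(seed)
    n = len(preds)
    phi_hat = dp_gap(preds, groups)
    errs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        errs[b] = dp_gap(preds[idx], groups[idx]) - phi_hat
    return np.quantile(errs, 1 - alpha)  # one-sided (1 - alpha) error bound

# Illustrative use with synthetic predictions on a two-group calibration set.
rng = np.random.default_rng(1)
groups = rng.integers(0, 2, size=500)
preds = rng.integers(0, 2, size=500)
print(bootstrap_error_quantile(preds, groups))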
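
The 10% calibration split quoted in the Dataset Splits row can be produced with a standard disjoint hold-out. A minimal sketch, assuming scikit-learn and synthetic placeholder data; the variable names and the choice to stratify on the labels are illustrative, not taken from the paper:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))        # features (placeholder)
A = rng.integers(0, 2, size=1000)     # sensitive attribute
Y = rng.integers(0, 2, size=1000)     # labels

# Hold out 10% of the data as the calibration set D_cal,
# disjoint from the data used for training.
X_tr, X_cal, A_tr, A_cal, Y_tr, Y_cal = train_test_split(
    X, A, Y, test_size=0.10, random_state=0, stratify=Y
)
```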
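To make the quoted experiment setup concrete, the sketch below shows (a) a log-uniform draw of λ over [10⁻⁶, 10] and (b) a generic FiLM layer in the sense of [34], which scales and shifts hidden features using parameters predicted from a conditioning input. PyTorch, the layer sizes, and conditioning directly on λ are assumptions made for illustration; the paper's exact architecture may differ.

```python
import math
import torch
import torch.nn as nn

def sample_lambda(low=1e-6, high=10.0):
    # Log-uniform draw: sample uniformly on [log(low), log(high)],
    # then exponentiate, matching the quoted U[10^-6, 10] range.
    return torch.empty(1).uniform_(math.log(low), math.log(high)).exp()

class FiLM(nn.Module):
    # Feature-wise Linear Modulation: scale and shift hidden features
    # with parameters predicted from the conditioning input.
    def __init__(self, cond_dim, feat_dim):
        super().__init__()
        self.gamma = nn.Linear(cond_dim, feat_dim)
        self.beta = nn.Linear(cond_dim, feat_dim)

    def forward(self, h, cond):
        return self.gamma(cond) * h + self.beta(cond)

film = FiLM(cond_dim=1, feat_dim=64)
h = torch.randn(32, 64)              # a batch of hidden features
lam = sample_lambda().expand(32, 1)  # one lambda shared across the batch
h_mod = film(h, lam)                 # lambda-conditioned features
```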