Multigroup Robustness

Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Section 7, we empirically demonstrate that several standard models for classification fail to preserve multigroup robustness under simple label-flipping and data addition attacks on the Adult Income Dataset." "In Section 7, we supplement our theoretical results with experiments on real-world census datasets demonstrating that our post-processing approach can be added to existing learning algorithms to provide multigroup robustness protections without a drop in accuracy."
Researcher Affiliation | Academia | Lunjia Hu*, Charlotte Peale*, Judy Hanwen Shen* (Stanford University, USA). Correspondence to: Charlotte Peale <cpeale@stanford.edu>.
Pseudocode | Yes | Algorithm 1: Multiaccuracy Boost on Empirical Distribution (see the sketch after this table).
Open Source Code | Yes | Code to replicate experiments can be found at: https://github.com/heyyjudes/multigroup-robust
Open Datasets | Yes | "Due to the multigroup focus of our work, we examine several standard fairness datasets including Folktables-Income, Employment, Public Coverage (Ding et al., 2021), Bank (Moro & Cortez, 2012), and Law School (Sander, 2004)."
Dataset Splits | No | "The γ threshold is optimized on the entire held-out validation set. We measure all of these results on a test set, while both training and post-processing with Algorithm 1 are done on the training set." (An illustrative split protocol is sketched after this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions machine learning models and libraries (e.g., scikit-learn in a footnote) but does not give version numbers for the software dependencies used in its experiments.
Experiment Setup | No | The paper mentions hyperparameter searches for some models (e.g., a "hyperparameter search over the learning rate and l2 regularization weight" for the MLP and a "parameter search from 3, 5, and 7 nearest neighbors" for k-NN) but does not report the chosen values or other detailed training configurations. (A sketch of such a search follows this table.)
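The paper's Algorithm 1 applies multiaccuracy boosting to the empirical distribution. As a rough illustration of that idea only, here is a minimal NumPy sketch assuming binary labels and a fixed collection of boolean group masks; the function name, the constant-shift update, and the stopping rule are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def multiaccuracy_boost(preds, y, groups, gamma=0.01, max_iters=100):
    """Minimal sketch of multiaccuracy boosting on an empirical sample.

    preds  : initial predicted probabilities, shape (n,)
    y      : binary labels in {0, 1}, shape (n,)
    groups : list of boolean masks, one per subpopulation in the collection
    gamma  : multiaccuracy tolerance (tuned on held-out data in the paper)
    """
    p = np.clip(np.asarray(preds, dtype=float).copy(), 0.0, 1.0)
    for _ in range(max_iters):
        # Mean residual on each group measures its multiaccuracy violation.
        residuals = [np.mean(y[g] - p[g]) if g.any() else 0.0 for g in groups]
        worst = int(np.argmax(np.abs(residuals)))
        if abs(residuals[worst]) <= gamma:
            break  # every group is within tolerance: p is gamma-multiaccurate
        # Shift predictions on the worst group toward its empirical label mean.
        g = groups[worst]
        p[g] = np.clip(p[g] + residuals[worst], 0.0, 1.0)
    return p
```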
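Per the "Dataset Splits" row, training and post-processing happen on the training set, the γ threshold is optimized on a held-out validation set, and results are measured on a test set; the paper does not state split proportions. A hypothetical rendering of that protocol, with invented proportions, an invented γ grid, and placeholder helpers (`val_preds`, `val_groups`, `val_score`), might look like:

```python
from sklearn.model_selection import train_test_split

# X, y: features and labels of one of the fairness datasets.
# Proportions and seed below are assumptions; the paper does not report them.
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_dev, y_dev, test_size=0.25, random_state=0)

# Optimize the gamma threshold on the entire held-out validation set.
# `val_preds`, `val_groups`, and `val_score` are hypothetical placeholders
# for the base model's validation predictions, the group masks, and a metric.
candidate_gammas = [0.001, 0.005, 0.01, 0.05]  # an assumed grid
best_gamma = max(
    candidate_gammas,
    key=lambda g: val_score(multiaccuracy_boost(val_preds, y_val, val_groups, gamma=g)),
)
```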
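Finally, the "Experiment Setup" row notes a search over the MLP's learning rate and l2 regularization weight and over 3, 5, and 7 nearest neighbors for k-NN, without reporting the chosen values. A scikit-learn sketch of such a search follows; the MLP grids and the cross-validation setting are assumptions, while the k-NN candidates {3, 5, 7} come from the paper.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier

# MLP: search over the learning rate and the l2 penalty (alpha in scikit-learn).
# The concrete grids here are illustrative guesses, not the paper's values.
mlp_search = GridSearchCV(
    MLPClassifier(max_iter=500),
    param_grid={
        "learning_rate_init": [1e-4, 1e-3, 1e-2],
        "alpha": [1e-4, 1e-3, 1e-2],
    },
    cv=3,
)

# k-NN: the paper searches over 3, 5, and 7 nearest neighbors.
knn_search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [3, 5, 7]},
    cv=3,
)

# Usage: mlp_search.fit(X_train, y_train); knn_search.fit(X_train, y_train)
```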