Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Multigroup Robustness
Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 7, we empirically demonstrate that several standard models for classification fail to preserve multigroup robustness under simple label-flipping and data addition attacks on the Adult Income Dataset. In Section 7, we supplement our theoretical results with experiments on real-world census datasets demonstrating that our post-processing approach can be added to existing learning algorithms to provide multigroup robustness protections without a drop in accuracy. |
| Researcher Affiliation | Academia | Lunjia Hu*¹, Charlotte Peale*¹, Judy Hanwen Shen*¹. ¹Stanford University, USA. Correspondence to: Charlotte Peale <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Multiaccuracy Boost on Empirical Distribution |
| Open Source Code | Yes | Code to replicate experiments can be found at: https://github.com/heyyjudes/multigroup-robust |
| Open Datasets | Yes | Due to the multigroup focus of our work, we examine several standard fairness datasets including Folktables-Income, Employment, Public Coverage (Ding et al., 2021), Bank (Moro & Cortez, 2012), and Law School (Sander, 2004). |
| Dataset Splits | No | the γ threshold is optimized on the entire held-out validation set. We measure all of these results on a test set while both training and post-process with Algorithm 1 are done on the training set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions machine learning models and libraries (e.g., scikit-learn in a footnote), but does not provide specific version numbers for the software dependencies used in their experiments. |
| Experiment Setup | No | The paper mentions hyperparameter search for some models (e.g., 'hyperparameter search over the learning rate and l2 regularization weight' for MLP, and 'parameter search from 3, 5, and 7 nearest neighbors' for k-NN), but it does not explicitly provide the chosen specific values for these hyperparameters or other detailed training configurations. |
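
The Dataset Splits and Experiment Setup rows are marked "No" because split sizes and the selected hyperparameter values are not stated. As an illustration only, the sketch below shows what a fully reported k-NN search over 3, 5, and 7 neighbors could record. It is not the authors' code: the synthetic data, the 60/20/20 split, the seed, and every function name (`make_data`, `knn_predict`, `accuracy`, `run_search`) are assumptions made for this example.

```python
import random
from collections import Counter


def make_data(n, rng):
    """Hypothetical toy task: 2-D points labeled 1 when x + y > 1."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, int(p[0] + p[1] > 1.0)) for p in points]


def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    nearest = sorted(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, eval_set, k):
    """Fraction of eval_set examples that k-NN on train classifies correctly."""
    correct = sum(knn_predict(train, p, k) == y for p, y in eval_set)
    return correct / len(eval_set)


def run_search(seed=0, n=300, candidates=(3, 5, 7)):
    """Select k on a validation split and return everything needed to rerun."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    data = make_data(n, rng)

    # Assumed 60/20/20 split, computed with integer arithmetic.
    n_train, n_val = n * 6 // 10, n * 2 // 10
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]

    val_scores = {k: accuracy(train, val, k) for k in candidates}
    best_k = max(candidates, key=lambda k: val_scores[k])

    # The returned record states the split sizes and the chosen value explicitly.
    return {
        "seed": seed,
        "split_sizes": {"train": len(train), "val": len(val), "test": len(test)},
        "candidates": list(candidates),
        "val_accuracy": val_scores,
        "selected_k": best_k,
        "test_accuracy": accuracy(train, test, best_k),
    }


report = run_search()
print(report)
```

Publishing a record like `report` alongside the results would state the seed, the split sizes, the candidate grid, and the selected value, which are the details these two rows look for.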