Differentially Private Worst-group Risk Minimization
Authors: Xinyu Zhou, Raef Bassily
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We initiate a systematic study of worst-group risk minimization under (ϵ, δ)-differential privacy (DP). The goal is to privately find a model that approximately minimizes the maximal risk across p sub-populations (groups) with different distributions, where each group distribution is accessed via a sample oracle. We first present a new algorithm that achieves excess worst-group population risk of O( p d K ), where K is the total number of samples drawn from all groups and d is the problem dimension. Our rate is nearly optimal when each distribution is observed via a fixedsize dataset of size K/p. Our result is based on a new stability-based analysis for the generalization error. In particular, we show that -uniform argument stability implies O( + 1 n) generalization error w.r.t. the worst-group risk, where n is the number of samples drawn from each sample oracle. Next, we propose an algorithmic framework for worst-group population risk minimization using any DP online convex optimization algorithm as a subroutine. Hence, we give another excess risk bound of O q ϵK + p p Kϵ2 + p p Assuming the typical setting of ϵ = Θ(1), this bound is more favorable than our first bound in a certain range of p as a function of K and d. Finally, we study differentially private worst-group empirical risk minimization in the offline setting, where each group distribution is observed by a fixed-size dataset. We present a new algorithm with nearly optimal excess risk of O( p d K ). |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, The Ohio State University 2Department of Computer Science & Engineering and the Translational Data Analytics Institute (TDAI), The Ohio State University. |
| Pseudocode | Yes | Algorithm 1 Minimax Phased ERM Algorithm 2 Worst-group risk minimization using DP-OCO algorithm Algorithm 3 Noisy SGD with Multiplicative Group Reweighting (Noisy-SGD-MGR) Algorithm 4 Minimax Optimization for SC-SC Objective |
| Open Source Code | No | No statement or link to open-source code is provided in the paper. |
| Open Datasets | No | The paper is theoretical and describes a model that accesses data via |
| Dataset Splits | No | The paper is theoretical and does not describe empirical dataset splits for validation, training, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list specific software dependencies with version numbers for reproducibility of experiments. It references algorithms like DP-FTRL but not specific software environments. |
| Experiment Setup | No | The paper is theoretical and does not describe an empirical experimental setup with concrete hyperparameters or training configurations. |