Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

Authors: Hilal Asi, Daogao Liu, Kevin Tian

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a kth-moment bound on the Lipschitz constants of sample functions, rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G_2 \cdot \frac{1}{\sqrt{n}} + G_k \cdot \left(\frac{\sqrt{d}}{n\varepsilon}\right)^{1-\frac{1}{k}}$ under $(\varepsilon, \delta)$-approximate differential privacy, up to a mild polylog(1/δ) factor, where $G_2^2$ and $G_k^k$ are the 2nd and kth moment bounds on sample Lipschitz constants, nearly matching a lower bound of [LR23]. We further give a suite of private algorithms in the heavy-tailed setting which improve upon our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz-constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models. ... Can we evaluate the algorithm's performance through numerical simulations or real-world datasets? We leave these questions for future research. (A display-form restatement of the stated error bound is given below the table.)
Researcher Affiliation | Collaboration | Hilal Asi (Apple Inc., hilal.asi94@gmail.com); Daogao Liu (University of Washington, liudaogao@gmail.com); Kevin Tian (University of Texas at Austin, kjtian@cs.utexas.edu)
Pseudocode | Yes | Algorithm 1: Clipped-DP-SGD(D, C, λ, {η_t}_{t∈[T]}, σ², T, r, X) ...; Algorithm 2: Population-Localize(x_0, P, λ, I) ...; Algorithm 3: Aggregate-ERM(x, λ, J, ρ, {s_ℓ}_{ℓ∈[n J]}, R) ...; Algorithm 4: Known-Lip-Reduction(D, C, µ, ρ, X, A) ...; Algorithm 5: One-Pass-Clipped-SGD(D, C, η, T, X, x_0) ...; Algorithm 6: SVT(D, {q_i}_{i∈[T]}, c, L, R, τ) ...; Algorithm 7: Localized-Clipped-DP-SGD(D, x_0, η, c, ε, δ) ...; Algorithm 8: One-Pass-Clipped-DP-SGD(D, n, X, x_0, ρ). (A minimal illustrative sketch of the clipped-DP-SGD template is given below the table.)
Open Source Code | No | No explicit statement or link indicating the availability of open-source code for the methodology described in the paper.
Open Datasets | No | The paper is theoretical and defines 'P is a distribution over a sample space S' without referring to or providing access information for a specific public dataset.
Dataset Splits | No | The paper is theoretical and does not conduct experiments, so there are no training/test/validation splits described.
Hardware Specification | No | The paper is theoretical and does not describe any experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe any experiments or specific software implementations that would require listing dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe any experiments or their setup, so no hyperparameter values or training configurations are mentioned.
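
For readability, here is a display-form restatement of the error guarantee claimed in the abstract, reconstructed from the excerpt above. The notation L(s) for the Lipschitz constant of the sample function f(·; s), and the comparison line for the classical uniform-Lipschitz DP-SCO rate, are additions of this sketch rather than quotations from the paper.

```latex
% Heavy-tailed guarantee stated in the abstract (up to polylog(1/delta) factors):
\mathbb{E}\bigl[f(\hat{x}) - \min_{x \in \mathcal{X}} f(x)\bigr]
  \;\lesssim\;
  G_2 \cdot \frac{1}{\sqrt{n}}
  \;+\;
  G_k \cdot \left(\frac{\sqrt{d}}{n\varepsilon}\right)^{1 - \frac{1}{k}},
\qquad \text{where}\;
  \mathbb{E}_{s \sim P}\bigl[L(s)^2\bigr] \le G_2^{2},\;\;
  \mathbb{E}_{s \sim P}\bigl[L(s)^k\bigr] \le G_k^{k}.
% For comparison, the standard rate when every sample function is G-Lipschitz is
% G * (1/sqrt(n) + sqrt(d)/(n*epsilon)), which the bound above recovers as k -> infinity.
```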
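
To make the recurring algorithmic building block concrete, below is a minimal illustrative sketch (in NumPy) of the clipped-SGD-with-Gaussian-noise template that Algorithms 1, 5, 7, and 8 are variants of. This is not the paper's exact procedure: the regularization λ, step-size schedule {η_t}, localization, and the calibration of the noise scale to a target (ε, δ) via the Gaussian mechanism are all omitted, and the names grad_fn, clip_c, noise_sigma, and radius are illustrative stand-ins for C, σ, r, and the constraint set X.

```python
import numpy as np

def clipped_dp_sgd(grad_fn, samples, x0, clip_c, lr, noise_sigma, num_steps, radius, rng=None):
    """Minimal sketch of the clipped-DP-SGD template: per-sample gradient
    clipping to l2 norm clip_c, Gaussian noise addition, a gradient step,
    and projection onto an l2 ball of the given radius (a stand-in for the
    constraint set X). Returns the average iterate."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for _ in range(num_steps):
        s = samples[rng.integers(len(samples))]              # draw one sample
        g = grad_fn(x, s)                                     # per-sample gradient
        g_norm = np.linalg.norm(g)
        if g_norm > clip_c:                                   # clip heavy-tailed gradients
            g = g * (clip_c / g_norm)
        g = g + rng.normal(0.0, noise_sigma, size=g.shape)    # Gaussian noise for privacy
        x = x - lr * g                                        # gradient step
        x_norm = np.linalg.norm(x)
        if x_norm > radius:                                   # project onto the l2 ball
            x = x * (radius / x_norm)
        iterates.append(x.copy())
    return np.mean(np.stack(iterates), axis=0)

# Toy usage: least-squares gradients for samples s = (a, b), with grad = (a^T x - b) a.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = [(rng.standard_normal(5), rng.standard_normal()) for _ in range(200)]
    grad = lambda x, s: (s[0] @ x - s[1]) * s[0]
    x_hat = clipped_dp_sgd(grad, data, np.zeros(5), clip_c=1.0, lr=0.05,
                           noise_sigma=0.1, num_steps=500, radius=10.0, rng=rng)
```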