Differentially Private Generalized Linear Models Revisited
Authors: Raman Arora, Raef Bassily, Cristóbal Guzmán, Michael Menart, Enayat Ullah
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the problem of $(\epsilon, \delta)$-differentially private learning of linear predictors with convex losses. We provide results for two subclasses of loss functions. The first case is when the loss is smooth and non-negative but not necessarily Lipschitz (such as the squared loss). For this case, we establish an upper bound on the excess population risk of $\tilde{O}\big(\frac{\lVert w^* \rVert}{\sqrt{n}} + \min\big\{\frac{\lVert w^* \rVert^{2}}{(n\epsilon)^{2/3}}, \frac{\sqrt{d}\,\lVert w^* \rVert^{2}}{n\epsilon}\big\}\big)$, where $n$ is the number of samples, $d$ is the dimension of the problem, and $w^*$ is the minimizer of the population risk. Apart from the dependence on $\lVert w^* \rVert$, our bound is essentially tight in all parameters. In particular, we show a lower bound of $\tilde{\Omega}\big(\frac{1}{\sqrt{n}} + \min\big\{\frac{\lVert w^* \rVert^{4/3}}{(n\epsilon)^{2/3}}, \frac{\sqrt{d}\,\lVert w^* \rVert}{n\epsilon}\big\}\big)$. We also revisit the previously studied case of Lipschitz losses [SSTT20]. For this case, we close the gap in the existing work and show that the optimal rate is (up to log factors) $\Theta\big(\frac{\lVert w^* \rVert}{\sqrt{n}} + \min\big\{\frac{\lVert w^* \rVert}{\sqrt{n\epsilon}}, \frac{\sqrt{\mathrm{rank}}\,\lVert w^* \rVert}{n\epsilon}\big\}\big)$, where $\mathrm{rank}$ is the rank of the design matrix. This improves over existing work in the high-privacy regime. Finally, our algorithms involve a private model selection approach that we develop to enable attaining the stated rates without a-priori knowledge of $\lVert w^* \rVert$. Our work is largely theoretical. Moreover, there are no foreseeable negative societal impacts for our proposed algorithms. (These rates are restated as display equations below the table.) |
| Researcher Affiliation | Academia | Raman Arora, Department of Computer Science, The Johns Hopkins University, arora@cs.jhu.edu; Raef Bassily, Department of Computer Science & Engineering and Translational Data Analytics Institute (TDAI), The Ohio State University, bassily.1@osu.edu; Cristóbal Guzmán, Inst. for Mathematical and Comput. Eng., Fac. de Matemáticas and Esc. de Ingeniería, Pontificia Universidad Católica de Chile, c.guzman@utwente.nl; Michael Menart, Department of Computer Science & Engineering, The Ohio State University, menart.2@osu.edu; Enayat Ullah, Department of Computer Science, The Johns Hopkins University, enayat@jhu.edu |
| Pseudocode | Yes | Algorithm 1 (Noisy GD) — Input: dataset $S$, loss function $\ell$, constraint set $\mathcal{W}$, steps $T$, learning rate $\eta$, noise scale $\sigma^2$. 1: $w_0 = 0$; 2: for $t = 0$ to $T-1$ do; 3: sample $\xi_t \sim N(0, \sigma^2 I)$; 4: $w_{t+1} = \Pi_{\mathcal{W}}\big(w_t - \eta(\nabla \hat{L}(w_t; S) + \xi_t)\big)$, where $\Pi_{\mathcal{W}}$ is the Euclidean projection onto $\mathcal{W}$; 5: end for; Output: $\hat{w} = \frac{1}{T}\sum_{t=1}^{T} w_t$. The paper also presents Algorithm 2 (JL Method), Algorithm 3 (Regularized Output Perturbation), and Algorithm 4 (Private Grid Search). A minimal runnable sketch of this noisy-GD loop appears after the table. |
| Open Source Code | No | The paper does not provide explicit access to source code for the methodology. The ethics statement indicates N/A for code. |
| Open Datasets | No | The paper describes theoretical constructs such as a training set $S_1 \in (\mathcal{X} \times \mathcal{Y})^{n/2}$, but does not use or provide access to any actual dataset for empirical training. The ethics statement indicates N/A for data. |
| Dataset Splits | No | The paper mentions 'validation set S2' as part of Algorithm 4 for theoretical model selection, but it does not specify any dataset splits for validation in empirical experiments. The ethics statement indicates N/A for data. |
| Hardware Specification | No | The paper is theoretical and does not conduct experiments, therefore no hardware specifications are mentioned. The ethics statement indicates N/A for compute resources. |
| Software Dependencies | No | The paper is theoretical and does not conduct experiments, so no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper is theoretical and does not conduct experiments, so no experimental setup details like hyperparameters or training settings are provided. |
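
For readability, the rates quoted in the Research Type row above are restated here as display equations. This is a direct transcription of the bounds in the abstract, with $w^*$ the population-risk minimizer and $\mathrm{rank}$ the rank of the design matrix:

```latex
% Smooth, non-negative (not necessarily Lipschitz) losses:
% upper bound on the excess population risk, and the corresponding lower bound.
\[
\tilde{O}\!\left(\frac{\lVert w^* \rVert}{\sqrt{n}}
  + \min\!\left\{\frac{\lVert w^* \rVert^{2}}{(n\epsilon)^{2/3}},\;
                 \frac{\sqrt{d}\,\lVert w^* \rVert^{2}}{n\epsilon}\right\}\right),
\qquad
\tilde{\Omega}\!\left(\frac{1}{\sqrt{n}}
  + \min\!\left\{\frac{\lVert w^* \rVert^{4/3}}{(n\epsilon)^{2/3}},\;
                 \frac{\sqrt{d}\,\lVert w^* \rVert}{n\epsilon}\right\}\right).
\]
% Lipschitz losses: the optimal rate up to logarithmic factors.
\[
\Theta\!\left(\frac{\lVert w^* \rVert}{\sqrt{n}}
  + \min\!\left\{\frac{\lVert w^* \rVert}{\sqrt{n\epsilon}},\;
                 \frac{\sqrt{\mathrm{rank}}\,\lVert w^* \rVert}{n\epsilon}\right\}\right).
\]
```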
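
The Pseudocode row summarizes Algorithm 1 (Noisy GD). The following is a minimal sketch of that noisy projected-gradient loop, assuming a squared-loss linear model; the function names, the synthetic data, and the noise scale `sigma` used here are illustrative placeholders, not the paper's privacy calibration (which sets $\sigma$ from $(\epsilon, \delta)$, the smoothness/Lipschitz constants, $n$, and $T$).

```python
# Minimal sketch of noisy projected gradient descent in the style of Algorithm 1.
# Assumptions: squared loss, constraint set W = {w : ||w|| <= radius}; the noise
# scale passed in is a placeholder rather than a privacy-calibrated value.
import numpy as np

def noisy_gd(X, y, radius, steps, lr, sigma, seed=None):
    """Run noisy GD and return the average of the iterates, as in Algorithm 1."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    iterates = []
    for _ in range(steps):
        # Gradient of the empirical squared loss (1/2n) * ||Xw - y||^2.
        grad = X.T @ (X @ w - y) / n
        # Perturb the gradient with isotropic Gaussian noise of scale sigma.
        noise = rng.normal(scale=sigma, size=d)
        w = w - lr * (grad + noise)
        # Euclidean projection onto the ball of the given radius.
        norm = np.linalg.norm(w)
        if norm > radius:
            w = w * (radius / norm)
        iterates.append(w)
    return np.mean(iterates, axis=0)

if __name__ == "__main__":
    data_rng = np.random.default_rng(0)
    n, d = 500, 10
    X = data_rng.normal(size=(n, d))
    w_star = data_rng.normal(size=d)
    y = X @ w_star + 0.1 * data_rng.normal(size=n)
    # sigma=0.05 is illustrative only; the paper derives it from the privacy budget.
    w_hat = noisy_gd(X, y, radius=np.linalg.norm(w_star), steps=200, lr=0.1,
                     sigma=0.05, seed=1)
    print("distance to population minimizer:", np.linalg.norm(w_hat - w_star))
```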