Differentially Private Learning with Margin Guarantees

Authors: Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We present a series of new differentially private (DP) algorithms with dimension-independent margin guarantees. For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarantees, as well as an efficient DP learning algorithm with margin guarantees. We also present a new efficient DP learning algorithm with margin guarantees for kernel-based hypotheses with shift-invariant kernels, such as Gaussian kernels, and point out how our results can be extended to other kernels using oblivious sketching techniques. We further give a pure DP learning algorithm for a family of feed-forward neural networks for which we prove margin guarantees that are independent of the input dimension. Additionally, we describe a general label DP learning algorithm, which benefits from relative deviation margin bounds and is applicable to a broad family of hypothesis sets, including that of neural networks. Finally, we show how our DP learning algorithms can be augmented in a general way to include model selection, to select the best confidence margin parameter. (Illustrative sketches of a DP margin-based linear learner, a shift-invariant kernel feature map, and private margin selection appear after this table.)
Researcher Affiliation | Collaboration | Raef Bassily (The Ohio State University & Google Research, NY), bassily.1@osu.edu; Mehryar Mohri (Google Research & Courant Institute), mohri@google.com; Ananda Theertha Suresh (Google Research, NY), theertha@google.com
Pseudocode | Yes | Algorithm 1, APriv-Mrg: Private Learner of Linear Classifiers with Margin Guarantees. (A generic, non-verbatim sketch in this spirit is given after the table.)
Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. The author checklist states 'N/A' for inclusion of code.
Open Datasets | No | The paper is theoretical and does not conduct empirical studies with specific datasets. It refers to data generically as 'a sample S of size m from D' without mentioning any named public datasets or providing access information for any data.
Dataset Splits | No | The paper is theoretical and does not conduct empirical studies that would involve dataset splits. The author checklist indicates 'N/A' for training details, including data splits.
Hardware Specification | No | The paper does not conduct empirical experiments and therefore does not describe the hardware used. The author checklist states 'N/A' for compute resources.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers required for replication of experiments.
Experiment Setup | No | The paper is theoretical and does not conduct empirical experiments; it therefore does not provide details of an experimental setup, such as hyperparameters or system-level training settings.
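
The pseudocode row refers to the paper's Algorithm 1 (APriv-Mrg), a private learner of linear classifiers with margin guarantees. The sketch below is not that algorithm; it is a minimal, generic illustration of the same kind of pipeline, assuming a margin-preserving Johnson-Lindenstrauss projection followed by output-perturbed minimization of a regularized margin (hinge) loss, with Gaussian noise calibrated to the usual 2/(n·λ·γ) sensitivity bound. All function names and parameter choices below are illustrative, not taken from the paper.

```python
# Illustrative sketch only: a dimension-reduced, output-perturbed DP linear
# classifier with a margin parameter. This is NOT the paper's Algorithm 1.
import numpy as np

def jl_project(X, k, rng):
    """Johnson-Lindenstrauss projection to k dimensions (margin-preserving w.h.p.)."""
    d = X.shape[1]
    G = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ G, G

def dp_margin_linear_classifier(X, y, margin, eps, delta, lam=0.1, k=None,
                                steps=500, lr=0.1, seed=0):
    """Output-perturbation sketch (y in {-1, +1}): run subgradient descent on a
    regularized margin (hinge) loss in the projected space, then add Gaussian
    noise scaled to the ERM solution's L2 sensitivity, roughly 2/(n*lam*margin)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)  # force ||x|| <= 1
    k = k or max(1, int(np.ceil(1.0 / margin ** 2)))  # projection dim ~ 1/margin^2
    Xp, G = jl_project(X, k, rng)

    w = np.zeros(k)
    for _ in range(steps):
        # Objective: (1/n) * sum_i max(0, 1 - y_i <w, x_i> / margin) + (lam/2) * ||w||^2
        active = y * (Xp @ w) < margin
        grad = -(y[active, None] * Xp[active]).sum(axis=0) / (n * margin) + lam * w
        w -= lr * grad

    sensitivity = 2.0 / (n * lam * margin)  # hinge/margin loss is (1/margin)-Lipschitz in w
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    w_priv = w + rng.normal(scale=sigma, size=k)
    return w_priv, G  # predict with sign((x @ G) @ w_priv)
```

Prediction uses sign((x · G) · w_priv); releasing the pair (w_priv, G) is private here because G is drawn independently of the data and w_priv is the output of a Gaussian mechanism.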
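The abstract also mentions DP learning with margin guarantees for kernel-based hypotheses with shift-invariant kernels such as Gaussian kernels, with extensions to other kernels via oblivious sketching. One standard way to make this concrete, again purely as an illustration rather than the paper's construction, is to approximate the kernel with random Fourier features and then hand the resulting finite-dimensional representation to a DP linear learner.

```python
# Sketch: approximate a Gaussian (shift-invariant) kernel with random Fourier
# features, then apply a DP linear learner to the features. Bandwidth and
# feature count are illustrative choices, not the paper's construction.
import numpy as np

def random_fourier_features(X, num_features, bandwidth, seed=0):
    """Rahimi-Recht features: z(x) = sqrt(2/D) * cos(W x + b), with
    W ~ N(0, 1/bandwidth^2) and b ~ Uniform[0, 2*pi), so z(x).z(x') ~ k(x, x')."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, num_features))
    b = rng.uniform(0.0, 2 * np.pi, size=num_features)
    Z = np.sqrt(2.0 / num_features) * np.cos(X @ W + b)
    return Z, (W, b)

# Usage sketch ('dp_margin_linear_classifier' is the illustrative learner from
# the previous snippet, not an API from the paper):
# Z, fmap = random_fourier_features(X, num_features=512, bandwidth=1.0)
# w_priv, G = dp_margin_linear_classifier(Z, y, margin=0.1, eps=1.0, delta=1e-5)
```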
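Finally, the abstract notes that the DP learners can be augmented with model selection to choose the best confidence-margin parameter. A generic private-selection tool for a finite grid of candidate margins is the exponential mechanism; the sketch below scores each candidate by its empirical margin error and assumes a 1/n score sensitivity, which is an illustrative choice rather than the paper's selection procedure.

```python
# Sketch: private selection of a confidence-margin parameter from a finite grid
# via the exponential mechanism. Scores are negative empirical margin errors in
# [0, 1], so changing one example shifts each score by at most 1/n.
import numpy as np

def private_margin_selection(errors_per_margin, n, eps, seed=0):
    """errors_per_margin[i]: empirical margin error in [0, 1] of the model trained
    with candidate margin i. Returns the index chosen by the exponential mechanism."""
    rng = np.random.default_rng(seed)
    scores = -np.asarray(errors_per_margin)   # higher score = better candidate
    sensitivity = 1.0 / n
    logits = eps * scores / (2.0 * sensitivity)
    logits -= logits.max()                    # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)
```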