Limits of Private Learning with Access to Public Data
Authors: Noga Alon, Raef Bassily, Shay Moran
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the limits of learning in this setting in terms of private and public sample complexities. We show that any hypothesis class of VC-dimension d can be agnostically learned up to an excess error of α using only (roughly) d/α public examples and d/α² private labeled examples. This result holds even when the public examples are unlabeled. This gives a quadratic improvement over the standard d/α² upper bound on the public sample complexity (where private examples can be ignored altogether if the public examples are labeled). Furthermore, we give a nearly matching lower bound, which we prove via a generic reduction from this setting to the one of private learning without public data. |
| Researcher Affiliation | Collaboration | Noga Alon, Department of Mathematics, Princeton University, nalon@math.princeton.edu; Raef Bassily, Department of Computer Science & Engineering, The Ohio State University, bassily.1@osu.edu; Shay Moran, Google AI Princeton, shaymoran1@gmail.com |
| Pseudocode | Yes | Algorithm 1 ASSPP: Semi-Supervised Semi-Private Agnostic Learner; Algorithm 2: Description of the private learner B; Algorithm 3: Private Sample Generator PrivSamp; Algorithm 4: Public Sample Generator PubSamp |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and focuses on sample complexities and bounds for learning. It defines concepts like "private sample" and "public sample" but does not use or specify a named, publicly available dataset for empirical evaluation. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments with data splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe empirical experiments, thus no hardware specifications for running experiments are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe empirical experiments requiring specific software versions. It presents algorithms and proofs. |
| Experiment Setup | No | The paper is theoretical and does not describe empirical experiments. Therefore, there are no specific experimental setup details like hyperparameters or system-level training settings. |
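The sample complexity bounds quoted in the Research Type row can be restated compactly. A sketch in standard learning-theory notation (the symbols `n_pub` and `n_priv` are our labels, not the paper's; logarithmic factors are suppressed via the tilde):

```latex
% Semi-private agnostic learning of a hypothesis class H with
% VC-dimension d, up to excess error \alpha:
n_{\mathrm{pub}}  = \tilde{O}\!\left(\frac{d}{\alpha}\right)
  \quad \text{(public examples, which may be unlabeled)},
\qquad
n_{\mathrm{priv}} = \tilde{O}\!\left(\frac{d}{\alpha^{2}}\right)
  \quad \text{(labeled private examples)}.
```

The public requirement of roughly d/α is a quadratic improvement over the d/α² examples that purely public (labeled) agnostic learning would need, and the paper's lower bound shows this trade-off is nearly tight.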