reproducibilityindex.ai

Limits of Private Learning with Access to Public Data

Authors: Noga Alon, Raef Bassily, Shay Moran

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We study the limits of learning in this setting in terms of private and public sample complexities. We show that any hypothesis class of VC-dimension d can be agnostically learned up to an excess error of α using only (roughly) d/α public examples and d/α2 private labeled examples. This result holds even when the public examples are unlabeled. This gives a quadratic improvement over the standard d/α2 upper bound on the public sample complexity (where private examples can be ignored altogether if the public examples are labeled). Furthermore, we give a nearly matching lower bound, which we prove via a generic reduction from this setting to the one of private learning without public data.
Researcher Affiliation	Collaboration	Noga Alon Department of Mathematics Princeton University nalon@math.princeton.edu Raef Bassily Department of Computer Science & Engineering The Ohio State University bassily.1@osu.edu Shay Moran Google AI Princeton shaymoran1@gmail.com
Pseudocode	Yes	Algorithm 1 ASSPP: Semi-Supervised Semi-Private Agnostic Learner... Algorithm 2 Description of the private learner B:... Algorithm 3 Private Sample Generator Priv Samp:... Algorithm 4 Public Sample Generator Pub Samp:
Open Source Code	No	The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets	No	The paper is theoretical and focuses on sample complexities and bounds for learning. It defines concepts like "private sample" and "public sample" but does not use or specify a named, publicly available dataset for empirical evaluation.
Dataset Splits	No	The paper is theoretical and does not conduct empirical experiments with data splits for training, validation, or testing.
Hardware Specification	No	The paper is theoretical and does not describe empirical experiments, thus no hardware specifications for running experiments are provided.
Software Dependencies	No	The paper is theoretical and does not describe empirical experiments requiring specific software versions. It presents algorithms and proofs.
Experiment Setup	No	The paper is theoretical and does not describe empirical experiments. Therefore, there are no specific experimental setup details like hyperparameters or system-level training settings.