Learning from Mixtures of Private and Public Populations

Authors: Raef Bassily, Shay Moran, Anupama Nandi

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our construction outputs a hypothesis with excess true error α using an input sample of size O(d²/(εα)) in the realizable setting, and a sample of size O(d² max{1/ε, 1/α}) in the agnostic setting. Our algorithm is an improper learner; specifically, the output hypothesis is given by the intersection of at most d halfspaces. ... The main goal of this work is to introduce a new, more flexible framework for differentially private learning that captures more realistic scenarios than prior works.
Researcher Affiliation | Academia | Raef Bassily, Department of Computer Science & Engineering, The Ohio State University (bassily.1@osu.edu); Shay Moran, Department of Mathematics, Technion - Israel Institute of Technology (smoran@technion.ac.il); Anupama Nandi, Department of Computer Science & Engineering, The Ohio State University (nandi.10@osu.edu)
Pseudocode | Yes | Algorithm 1 (A_Constr-Half): Construction of the family C̃_pub of halfspaces; Algorithm 2 (A_Learn-Half): PPM Learning of Halfspaces
Open Source Code | No | The paper makes no mention of open-source code availability, nor does it provide links to code repositories. It refers to a 'full version [BMN20]' for additional details, which is an arXiv preprint, not a code release.
Open Datasets | No | The paper does not use or refer to any specific publicly available dataset. It discusses theoretical distributions (D_priv and D_pub) and uses illustrative examples such as medical studies and credit-worthiness assessment, but provides no dataset access information.
Dataset Splits | No | The paper is theoretical and focuses on sample-complexity bounds for learning; it does not describe any experimental setup involving training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and conducts no empirical experiments, so no hardware is specified.
Software Dependencies | No | The paper is theoretical and conducts no empirical experiments, so no software dependencies or version numbers are listed.
Experiment Setup | No | The paper focuses on algorithm design and theoretical analysis (e.g., sample complexity); it does not describe an experimental setup with hyperparameters, training configurations, or system-level settings.
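The sample-complexity bounds quoted in the Research Type row can be illustrated with a short numeric sketch. This is an assumption-laden illustration, not the paper's method: the constant factor `c` and the function names are hypothetical, since the paper states only asymptotic O(·) bounds.

```python
import math

def realizable_bound(d, eps, alpha, c=1.0):
    """Illustrative sample size for the realizable setting, O(d^2 / (eps * alpha)).
    The constant c is hypothetical; the paper gives only the asymptotic rate."""
    return math.ceil(c * d**2 / (eps * alpha))

def agnostic_bound(d, eps, alpha, c=1.0):
    """Illustrative sample size for the agnostic setting, O(d^2 * max{1/eps, 1/alpha})."""
    return math.ceil(c * d**2 * max(1.0 / eps, 1.0 / alpha))

# Example: dimension d = 10, privacy parameter eps = 0.1, excess error alpha = 0.05.
print(realizable_bound(d=10, eps=0.1, alpha=0.05))  # 20000
print(agnostic_bound(d=10, eps=0.1, alpha=0.05))    # 2000
```

Note how the realizable bound grows with the product 1/(εα), while the agnostic bound depends only on the larger of 1/ε and 1/α, so the two can differ by an order of magnitude at the same parameters.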