Model-Agnostic Private Learning

Authors: Raef Bassily, Om Thakkar, Abhradeep Guha Thakurta

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We design differentially private learning algorithms that are agnostic to the learning model assuming access to a limited amount of unlabeled public data. First, we provide a new differentially private algorithm for answering a sequence of m online classification queries (given by a sequence of m unlabeled public feature vectors) based on a private training set. Our algorithm follows the paradigm of subsample-and-aggregate... We show that our algorithm makes a conservative use of the privacy budget... In the PAC model, we analyze our construction and prove upper bounds on the sample complexity for both the realizable and the non-realizable cases. Similar to non-private sample complexity, our bounds are completely characterized by the VC dimension of the concept class. (A minimal sketch of the subsample-and-aggregate step appears after this table.)
Researcher Affiliation | Academia | Department of Computer Science & Engineering, The Ohio State University (bassily.1@osu.edu); Department of Computer Science, Boston University (omthkkr@bu.edu); Department of Computer Science, University of California, Santa Cruz (aguhatha@ucsc.edu)
Pseudocode | Yes | Algorithm 1, A_stab [26]: private release of a classification query via distance to instability; Algorithm 2, A_binClas: private online binary classification via subsample-and-aggregate and the sparse vector technique; Algorithm 3, A_Priv: private learner. (A sketch of the distance-to-instability test appears after this table.)
Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code for the described methodology is being released.
Open Datasets | No | The paper is theoretical and does not use specific, named, publicly available datasets. It refers to a 'private labeled dataset denoted by D' and 'unlabeled public data' but does not provide access information or citations for them.
Dataset Splits | No | The paper is theoretical and does not conduct experiments that would require specific training/validation/test dataset splits. It mentions 'standard validation techniques' in a remark, but not in the context of its own empirical setup.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup or the hardware used to run experiments.
Software Dependencies | No | The paper is theoretical and does not describe any specific software dependencies with version numbers for experimental reproducibility.
Experiment Setup | No | The paper is theoretical and does not describe specific experimental setup details, hyperparameters, or training configurations.
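
The subsample-and-aggregate paradigm quoted under Research Type is easy to illustrate. Below is a minimal Python sketch, not the paper's implementation: the function name `teacher_votes`, the `train_fn` callback, and the restriction to binary labels are all illustrative assumptions. The paper's Algorithm 2 couples this voting step with the sparse vector technique so that the privacy budget is spent conservatively.

```python
import numpy as np

def teacher_votes(X_priv, y_priv, x_query, k, train_fn, rng):
    """Subsample-and-aggregate, voting step only (illustrative sketch):
    split the private training set into k disjoint chunks, train one
    non-private classifier per chunk, and collect the classifiers'
    votes on a public, unlabeled query point. Binary labels {0, 1}
    are assumed for simplicity."""
    idx = rng.permutation(len(X_priv))        # random disjoint partition
    votes = np.zeros(2)
    for chunk in np.array_split(idx, k):
        clf = train_fn(X_priv[chunk], y_priv[chunk])  # any base learner
        votes[int(clf(x_query))] += 1         # each "teacher" casts one vote
    return votes
```

Because each private example influences exactly one teacher, changing a single example changes the vote vector by at most one vote, which is what the stability test below exploits.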
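
Algorithm 1 (A_stab) releases a query's answer only when it is far from "instability", i.e., when the vote margin is large enough that perturbing one private example cannot flip the majority. The sketch below shows that test under stated assumptions: the Laplace scale of 2/eps and the `threshold` parameter are placeholders, not the paper's calibrated constants.

```python
import numpy as np

def release_if_stable(votes, eps, threshold, rng):
    """Distance-to-instability test (in the spirit of Algorithm A_stab):
    release the majority label only if a noisy version of the vote
    margin clears a threshold; otherwise abstain (return None)."""
    top_two = np.sort(votes)[-2:]
    margin = top_two[1] - top_two[0]          # distance to flipping the argmax
    noisy_margin = margin + rng.laplace(scale=2.0 / eps)
    if noisy_margin > threshold:
        return int(np.argmax(votes))          # stable query: label released
    return None                               # unstable query: no answer
```

In the paper's online setting, the sparse vector technique wraps this test across the m queries so that, roughly, only the unstable queries consume privacy budget, which is the "conservative use of the privacy budget" quoted above.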