(Near) Dimension Independent Risk Bounds for Differentially Private Learning

Authors: Prateek Jain, Abhradeep Guha Thakurta

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we provide empirical evaluation of our proposed methods and compare them against the objective/output perturbation methods of Chaudhuri et al. (2011) over benchmark data sets. We show that the methods of Chaudhuri et al. (2011) indeed incur test error that grows with p, while our method is able to obtain accurate predictions even for high-dimensional data sets. Similarly, we also evaluate our proposed sampling based method for privacy preserving learning over simplex by simulations over a benchmark data set.
Researcher Affiliation | Collaboration | Prateek Jain (PRAJAIN@MICROSOFT.COM), Microsoft Research; Abhradeep Thakurta (B-ABHRAG@MICROSOFT.COM), Stanford University and Microsoft Research
Pseudocode | No | The paper describes algorithms but does not include any clearly labeled pseudocode or algorithm blocks as figures or structured text.
Open Source Code | No | The paper mentions "our code uses a modification of the LIBLINEAR method" but does not provide any link or explicit statement about making their specific implementation's source code available.
Open Datasets | Yes | For our first set of experiments, we apply SVM based classifiers on two benchmark datasets: URL and Cod-RNA. We use a subset of the URL dataset which has 100,000 data points and its dimensionality is around 20M. Cod-RNA has around 60K data points and its dimensionality is 8.
Dataset Splits | Yes | We use 70% of the data for training and the remaining 30% for test.
Hardware Specification | No | The paper does not specify any particular hardware (CPU, GPU, memory, etc.) used for running the experiments.
Software Dependencies | No | The paper mentions that "our code uses a modification of the LIBLINEAR method for solving the perturbed SVM problem" but does not provide version numbers for LIBLINEAR or any other software dependencies.
Experiment Setup | Yes | We set the regularization parameter λ = 0.001 and δ = 10⁻³. ... We conduct experiments on Cod-RNA dataset with ε = 10, δ = 10⁻³ and by using 70% of the data for training and the remaining for test.
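
The "Dataset Splits" and "Experiment Setup" rows above pin down enough of the reported protocol (70/30 split, λ = 0.001, ε = 10, δ = 10⁻³) to sketch it in code. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: scikit-learn's LinearSVC stands in for their modified LIBLINEAR solver, synthetic 8-dimensional data stands in for Cod-RNA, and Gaussian-mechanism output perturbation with the Chaudhuri et al. (2011) sensitivity bound stands in for the baseline they compare against; the noise calibration shown is an illustrative assumption, not taken from the paper.

```python
# Hedged sketch of the reported experimental protocol.
# Assumptions (not from the paper): LinearSVC replaces the modified LIBLINEAR
# solver, synthetic data replaces Cod-RNA, and Gaussian-mechanism output
# perturbation is used as a simplified stand-in for the Chaudhuri et al. (2011)
# baseline.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC


def output_perturbation_svm(X, y, lam=0.001, eps=10.0, delta=1e-3):
    """Train a regularized linear SVM, then add Gaussian noise to its weights.

    The L2 sensitivity bound 2 / (n * lam) for strongly convex ERM with a
    1-Lipschitz loss follows Chaudhuri et al. (2011); the noise scale
    sqrt(2 * ln(1.25 / delta)) * sensitivity / eps is the standard Gaussian
    mechanism calibration, used here only for illustration.
    """
    n = X.shape[0]
    # LinearSVC's C = 1 / (n * lam) matches the (lam / 2) * ||w||^2 regularizer.
    clf = LinearSVC(C=1.0 / (n * lam), loss="hinge", dual=True, max_iter=10000)
    clf.fit(X, y)

    sensitivity = 2.0 / (n * lam)
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / eps
    return clf.coef_.ravel() + np.random.normal(0.0, sigma, clf.coef_.size)


# 70% train / 30% test split, as reported in the "Dataset Splits" row above.
# X, y are synthetic placeholders; Cod-RNA is 8-dimensional per the paper.
rng = np.random.RandomState(0)
X = rng.randn(1000, 8)
y = np.sign(rng.randn(1000))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

w_priv = output_perturbation_svm(X_train, y_train, lam=0.001, eps=10.0, delta=1e-3)
test_err = np.mean(np.sign(X_test @ w_priv) != y_test)
print(f"private test error: {test_err:.3f}")
```

On synthetic data the numbers are meaningless; the point is only to make the reported split and the (λ, ε, δ) settings concrete, and to show where a perturbation baseline of the Chaudhuri et al. (2011) flavor would inject noise.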