Oracle Efficient Private Non-Convex Optimization

Authors: Seth Neel, Aaron Roth, Giuseppe Vietri, Steven Wu

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We complement our theoretical results with an empirical evaluation of the non-convex case, in which we use an integer program solver as our optimization oracle. We find that, for the problem of learning linear classifiers, directly optimizing for 0/1 loss using our approach can outperform the more standard approach of privately optimizing a convex surrogate loss function on the Adult dataset. For our experiments, we consider the problem of privately learning a linear threshold function to solve a binary classification task.
Researcher Affiliation | Academia | 1Wharton Statistics Department, University of Pennsylvania, Philadelphia, Pennsylvania, USA; 2Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA; 3Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA.
Pseudocode | Yes | Algorithm 1: Objective Perturbation over Discrete Space (OPDisc); Algorithm 2: Objective Perturbation Sampling (OPSamp)
Open Source Code | No | The paper does not provide a statement about releasing its source code or a link to a code repository for the described methodology. It mentions using the TensorFlow Privacy library but does not state that its own code is released.
Open Datasets | Yes | We use the Adult dataset (Lichman, 2013), a common benchmark dataset derived from Census data.
Dataset Splits | No | The paper describes the creation of a balanced subset of the Adult dataset (n = 15,682) but does not provide specific train/validation/test splits. It mentions running a grid search to identify parameters for private logistic regression, but no specific split percentages.
Hardware Specification | No | The paper does not provide specific hardware details, such as GPU/CPU models, processor types, or memory amounts, used for running its experiments.
Software Dependencies | No | The paper mentions using the TensorFlow Privacy library and the Gurobi MIP solver but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | The algorithm involves three parameters: gradient clip norm, mini-batch size, and learning rate. For each target pair of privacy parameters (ε, δ), we run a grid search to identify the triplet of parameters that gives the highest accuracy. ... In OPDisc, each coordinate w_j can take values in the discrete set {−B, −B+1, ..., B−1, B} with B = d, and we constrain ‖w‖₂ to be at most d. In RSPM, we optimize over the set {−1, 0, 1}^d.
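The grid search described above can be sketched as follows. This is a minimal illustration, not the paper's code: the parameter grids and the `train_and_evaluate` scoring function are hypothetical stand-ins (the paper's actual pipeline trains a private model, e.g. via the TensorFlow Privacy library, and measures test accuracy).

```python
from itertools import product

def train_and_evaluate(clip_norm, batch_size, learning_rate):
    """Hypothetical stand-in: would train a differentially private model
    with the given hyperparameters and return its accuracy. Here we use a
    dummy score so the sketch runs end to end."""
    return 1.0 / (1.0 + abs(clip_norm - 1.0) + abs(learning_rate - 0.1))

# Illustrative grids; the paper does not report the exact values searched.
clip_norms = [0.5, 1.0, 2.0]
batch_sizes = [64, 128, 256]
learning_rates = [0.01, 0.1, 0.5]

# Exhaustively score every (clip norm, batch size, learning rate) triplet
# and keep the one with the highest accuracy.
best = max(
    product(clip_norms, batch_sizes, learning_rates),
    key=lambda params: train_and_evaluate(*params),
)
print(best)
```

In the paper's setting this search is repeated once per target (ε, δ) pair, since the best-performing triplet generally changes with the privacy budget.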