Differentially Private Query Release Through Adaptive Projection

Authors: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Ankit A. Siva

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like k-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation, and then attempts to find the synthetic dataset that most closely matches the noisy answers. We perform experimental evaluations across a range of parameters and datasets, and find that our method outperforms existing algorithms on large query classes.
Researcher Affiliation | Collaboration | (1) Amazon Web Services AI/ML, (2) Columbia University, New York, NY, USA, (3) University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: William Brown <w.brown@columbia.edu>, Ankit Siva <ankitsiv@amazon.com>.
Pseudocode | Yes | Algorithm 1 Relaxed Projection (RP). Input: a vector of differentiable queries q : X^r → R^m, a vector of target answers â ∈ R^m, and an initial dataset D ∈ (X^r)^n. (A sketch of this projection step appears after the table.)
Open Source Code | Yes | We implement Algorithm 2 in Python (Van Rossum & Drake, 2009), using the JAX library (Bradbury et al., 2018) for auto-differentiation of queries and the Adam optimizer (Kingma & Ba, 2015)... Code: github.com/amazon-research/relaxed-adaptive-projection. (A sketch of a differentiable marginal query follows the table.)
Open Datasets | Yes | We evaluate both our algorithm and FEM on the two datasets used by Vietri et al. (2020) in their evaluation: ADULT and LOANS (Dua & Graff, 2017, UCI Machine Learning Repository, http://archive.ics.uci.edu/ml).
Dataset Splits | No | The paper mentions evaluating performance on datasets but does not provide specific details on training, validation, or test dataset splits or percentages.
Hardware Specification | Yes | We run our experiments for Algorithm 2 on an EC2 p2.xlarge instance (1 GPU, 4 CPUs, 61 GB RAM). For FEM we use the code from the authors of (Vietri et al., 2020)... we run FEM on a 2019 16-inch MacBook Pro (6 CPUs, 16 GB RAM).
Software Dependencies | Yes | We implement Algorithm 2 in Python (Van Rossum & Drake, 2009), using the JAX library (Bradbury et al., 2018) for auto-differentiation of queries and the Adam optimizer (Kingma & Ba, 2015)... Their code requires the Gurobi integer program solver...
Experiment Setup | Yes | For each call to RP, we do early stopping if the relative improvement in the loss function between consecutive Adam steps is less than 10^-7. The maximum number of Adam steps per RP round is set to 5000. For the remaining hyperparameters K and T, we optimize over a small grid of values (see Table 2) and report the combination with the smallest error. This is also how error is reported for FEM. For all experiments we optimize over X^r = [-1, 1]^d, which empirically had better convergence rates compared to using X^r = [0, 1]^d, likely because gradients of our queries vanish at 0. (A sketch of the stopping rule follows the table.) Hyperparameter grid (Table 2):
Parameter | Description | Values
K | Queries per round | 5, 10, 25, 50, 100
T | Number of iterations | 2, 5, 10, 25, 50
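
The Pseudocode row above only lists the inputs to Algorithm 1. As a rough illustration of how such a projection step can be set up in JAX, the following minimal sketch treats the relaxed dataset as the optimization variable and minimizes the squared distance between its query answers and the noisy targets with Adam. The function and parameter names (relaxed_projection, query_fn, n_prime) and the use of the optax library are assumptions made for illustration; they are not taken from the released code.

```python
# Minimal sketch of a relaxed projection step, assuming optax for Adam.
import jax
import jax.numpy as jnp
import optax

def relaxed_projection(query_fn, noisy_answers, d, n_prime=1000,
                       steps=5000, lr=1e-3, seed=0):
    """Optimize a relaxed dataset D in [-1, 1]^{n' x d} so that its query
    answers approximately match the noisy target answers."""
    key = jax.random.PRNGKey(seed)
    # Initialize the relaxed synthetic dataset uniformly in [-1, 1].
    D = jax.random.uniform(key, (n_prime, d), minval=-1.0, maxval=1.0)

    def loss(D):
        # Squared L2 distance between synthetic answers and noisy targets.
        return jnp.sum((query_fn(D) - noisy_answers) ** 2)

    opt = optax.adam(lr)
    opt_state = opt.init(D)
    grad_fn = jax.jit(jax.grad(loss))

    for _ in range(steps):
        grads = grad_fn(D)
        updates, opt_state = opt.update(grads, opt_state)
        D = optax.apply_updates(D, updates)
        # Keeping iterates inside the relaxed domain X^r = [-1, 1]^d by
        # clipping is a simplification; the released code may differ.
        D = jnp.clip(D, -1.0, 1.0)
    return D
```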
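
The Open Source Code row notes that JAX is used to auto-differentiate the queries. For intuition, the sketch below shows one standard way a single cell of a k-way marginal becomes a differentiable product query over one-hot-encoded, relaxed features. The column indices, the [0, 1] relaxation, and the helper name `marginal_query` are illustrative assumptions, not the repository's API; the paper's [-1, 1]^d domain would need an extra affine rescaling, omitted here.

```python
# Sketch: a k-way marginal cell as a differentiable product query over
# one-hot columns relaxed to [0, 1].
import jax
import jax.numpy as jnp

def marginal_query(cols):
    """Return a differentiable query: the fraction of rows whose one-hot
    columns `cols` are all 'on' (one cell of a k-way marginal)."""
    cols = jnp.asarray(cols)
    def q(D):
        # Product over the selected columns, averaged over rows.
        return jnp.mean(jnp.prod(D[:, cols], axis=1))
    return q

# Example: a 2-way marginal cell over one-hot columns 3 and 7.
D = jnp.full((100, 10), 0.5)   # toy relaxed dataset in [0, 1]^10
q = marginal_query([3, 7])
value = q(D)                   # answer of this single query
grad_D = jax.grad(q)(D)        # JAX auto-differentiates the query w.r.t. D
```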
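
The early-stopping criterion quoted in the Experiment Setup row can be made concrete with a small sketch: stop an RP round when the relative improvement of the loss between consecutive Adam steps falls below 10^-7, with at most 5000 steps per round. The names `run_rp_round` and `adam_step` are hypothetical stand-ins for the per-round loop and one optimizer update, not functions from the released code.

```python
# Sketch of the per-round early-stopping rule described above.
def run_rp_round(adam_step, params, max_steps=5000, rel_tol=1e-7):
    """Run up to `max_steps` optimizer steps, stopping early once the
    relative loss improvement between consecutive steps drops below rel_tol.
    `adam_step(params)` is assumed to return (new_params, current_loss)."""
    prev_loss = float("inf")
    for _ in range(max_steps):
        params, loss = adam_step(params)
        # Relative improvement between consecutive steps.
        if prev_loss < float("inf") and (prev_loss - loss) / prev_loss < rel_tol:
            break
        prev_loss = loss
    return params
```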