Differentially Private Query Release Through Adaptive Projection
Authors: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Ankit A. Siva
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose, implement, and evaluate a new algorithm for releasing answers to very large numbers of statistical queries like k-way marginals, subject to differential privacy. Our algorithm makes adaptive use of a continuous relaxation of the Projection Mechanism, which answers queries on the private dataset using simple perturbation, and then attempts to find the synthetic dataset that most closely matches the noisy answers. We perform experimental evaluations across a range of parameters and datasets, and find that our method outperforms existing algorithms on large query classes. (A hedged sketch of such a relaxed query appears after this table.) |
| Researcher Affiliation | Collaboration | ¹Amazon Web Services AI/ML, ²Columbia University, New York, NY, USA, ³University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: William Brown <w.brown@columbia.edu>, Ankit Siva <ankitsiv@amazon.com>. |
| Pseudocode | Yes | Algorithm 1 Relaxed Projection (RP). Input: a vector of differentiable queries q : X_r → R^m, a vector of target answers â ∈ R^m, and an initial dataset D ∈ (X_r)^n. (See the JAX sketch of this step after this table.) |
| Open Source Code | Yes | We implement Algorithm 2 in Python (Van Rossum & Drake, 2009), using the JAX library (Bradbury et al., 2018) for auto-differentiation of queries and the Adam optimizer (Kingma & Ba, 2015)... Code: github.com/amazon-research/relaxed-adaptive-projection |
| Open Datasets | Yes | We evaluate both our algorithm and FEM on the two datasets used by Vietri et al. (2020) in their evaluation: ADULT and LOANS (Dua & Graff, 2017), from the UCI Machine Learning Repository, http://archive.ics.uci.edu/ml. |
| Dataset Splits | No | The paper mentions evaluating performance on datasets but does not provide specific details on training, validation, or test dataset splits or percentages. |
| Hardware Specification | Yes | We run our experiments for Algorithm 2 on an EC2 p2.xlarge instance (1 GPU, 4 CPUs, 61 GB RAM). For FEM we use the code from the authors of Vietri et al. (2020)... we run FEM on a 2019 16-inch MacBook Pro (6 CPUs, 16 GB RAM). |
| Software Dependencies | Yes | We implement Algorithm 2 in Python (Van Rossum & Drake, 2009), using the JAX library (Bradbury et al., 2018) for auto-differentiation of queries and the Adam optimizer (Kingma & Ba, 2015)... Their code requires the Gurobi integer program solver... |
| Experiment Setup | Yes | For each call to RP, we do early stopping if the relative improvement in the loss function between consecutive Adam steps is less than 10^-7. The maximum number of Adam steps per RP round is set to 5000. For the remaining hyperparameters K and T, we optimize over a small grid of values (see Table 2) and report the combination with the smallest error; this is also how error is reported for FEM. For all experiments we optimize over X_r = [-1, 1]^d, which empirically had better convergence rates compared to using X_r = [0, 1]^d, likely because gradients of our queries vanish at 0. Hyperparameter grid (Table 2): K (queries per round) ∈ {5, 10, 25, 50, 100}; T (number of iterations) ∈ {2, 5, 10, 25, 50}. (The RP sketch after this table mirrors this early-stopping rule.) |
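
To make the "differentiable queries" in the rows above concrete, here is a minimal sketch of a product-query relaxation of a k-way marginal: on one-hot {0, 1}-encoded data, a marginal counts the rows whose k selected coordinates are all 1, which relaxes to the mean of their product. The name `product_query` is ours, and the paper's exact encoding may differ (it optimizes over X_r = [-1, 1]^d), so treat this as illustrative only.

```python
# Hypothetical sketch of a differentiable "product query" standing in for a
# k-way marginal; not the authors' implementation.
import jax.numpy as jnp

def product_query(feature_idx):
    """Return q(D) = mean over rows of the product of the selected columns.

    Exact on one-hot {0, 1}-valued data; differentiable on relaxed data.
    """
    idx = jnp.asarray(feature_idx)

    def q(D):
        return jnp.mean(jnp.prod(D[:, idx], axis=1))

    return q
```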
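Similarly, the Relaxed Projection step quoted in the Pseudocode row can be sketched in JAX roughly as follows. This is an assumption-laden sketch, not the released code: it uses `optax.adam` where the paper only says it uses the Adam optimizer via JAX, and `relaxed_projection`, `n_synth`, `lr`, and `seed` are illustrative names and defaults. The stopping rule mirrors the 10^-7 relative-improvement threshold and 5000-step cap from the Experiment Setup row.

```python
# Hypothetical sketch of the Relaxed Projection (RP) step; not the authors'
# released code. Assumed: optax for Adam; illustrative defaults throughout.
import jax
import jax.numpy as jnp
import optax

def relaxed_projection(q, a_hat, d, n_synth=1000, lr=1e-3,
                       max_steps=5000, rel_tol=1e-7, seed=0):
    """Gradient-descend a relaxed dataset D in ([-1, 1]^d)^n_synth so that
    the differentiable query answers q(D) match the noisy targets a_hat."""
    key = jax.random.PRNGKey(seed)
    D = jax.random.uniform(key, (n_synth, d), minval=-1.0, maxval=1.0)

    def loss(D):
        # Squared error between the relaxed dataset's answers and the targets.
        return jnp.sum((q(D) - a_hat) ** 2)

    opt = optax.adam(lr)
    opt_state = opt.init(D)
    prev = None
    for _ in range(max_steps):
        val, grads = jax.value_and_grad(loss)(D)
        updates, opt_state = opt.update(grads, opt_state)
        D = optax.apply_updates(D, updates)
        D = jnp.clip(D, -1.0, 1.0)  # stay in the relaxed domain X_r = [-1, 1]^d
        # Early stopping when relative improvement drops below rel_tol (1e-7),
        # as in the Experiment Setup row.
        if prev is not None and prev > 0 and (prev - val) / prev < rel_tol:
            break
        prev = float(val)
    return D
```

For instance, `relaxed_projection(product_query([0, 3]), a_hat, d=10)` would fit a synthetic dataset to a single noisy product query; the full adaptive algorithm (Algorithm 2 in the paper) repeats this projection over batches of K high-error queries for T rounds.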