Relaxed Marginal Consistency for Differentially Private Query Answering

Authors: Ryan McKenna, Siddhant Pradhan, Daniel R. Sheldon, Gerome Miklau

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Experiments. We begin by comparing the accuracy and scalability of APPGM and PRIVATE-PGM for estimating a fixed workload of marginal queries from noisy measurements of those marginals made by the Laplace mechanism. We use synthetic data to control the data domain and distribution (details in Appendix C.1) and measure k different 3-way marginals with ε = 1, for different settings of k. We run PROX-PGM with different versions of MARGINAL-ORACLE: Exact, Saturated Region Graph, and Factor Graph, where the first corresponds to PRIVATE-PGM, and the latter two to APPGM with the corresponding region graph (Figure 1).

Accuracy. We first show that when exact inference is tractable, some accuracy is lost by using approximate inference, but the estimated pseudo-marginals are much better than the noisy ones. We use eight-dimensional data with n1 = ... = n8 = 4, which is small enough that exact inference is always tractable, and measure random 3-way marginals for k from 1 to (8 choose 3) = 56. We then run 10,000 iterations of PROX-PGM using differing choices for MARGINAL-ORACLE, and report the L1 error on in- and out-of-model marginals, averaged over all cliques and across five trials. In Figure 2a, we see that the error of all PROX-PGM variants is always lower than the error of the noisy measurements themselves. Exact (PRIVATE-PGM) always has the lowest error, followed by Saturated Region Graph, then Factor Graph. This matches expectations: richer region graph structures encode more of the actual constraints and hence yield lower error. The trend is similar for out-of-model marginals (Figure 2b). Factor Graph, which only enforces consistency with respect to the one-way marginals, performs poorly on unmeasured cliques, while Saturated Region Graph and Exact, which enforce more constraints, do substantially better. The difference between Saturated Region Graph and Exact is smaller, but meaningful.

Scalability. Next we consider high-dimensional data and compare the scalability of Exact and Saturated Region Graph on 100-dimensional data with n1 = ... = n100 = 10. We vary k from 1 to 10,000 and calculate the per-iteration time of PROX-PGM. We consider two schemes for selecting measured cliques: random selects triples of attributes uniformly at random, and greedy selects triples to minimize the junction tree size in each iteration. As shown in Figure 2c, Exact can handle about 50 random measured cliques or 1,000 greedy ones before the per-iteration time becomes too expensive (the growth rate is exponential). In contrast, Saturated Region Graph runs with 10,000 measured cliques for either strategy and could run on larger cases (the growth rate is linear).

Improving Scalability and Accuracy in Sophisticated Mechanisms. We show next how PROX-PGM can be used to improve the performance of two sophisticated mechanisms for answering complex workloads of linear queries, including marginals. HDMM [32] is a state-of-the-art algorithm that first selects a strategy set of (weighted) marginal queries to be measured, and then reconstructs answers to workload queries. MWEM [5] is another competitive mechanism that iteratively measures poorly approximated queries to improve a data distribution approximation. In each algorithm, the bottleneck in high dimensions is estimating a data distribution p̂, which is prohibitive to do explicitly. PRIVATE-PGM can extend each algorithm to run in higher dimensions [16], but still becomes infeasible with enough dimensions and measurements.
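As context for the first experiment quoted above, the measurement step it describes (k random 3-way marginals released via the Laplace mechanism) can be sketched as follows. The function name, the uniform per-marginal privacy accounting, and the NumPy representation are our assumptions for illustration, not details from the paper:

import numpy as np

def measure_3way_marginals(data, domain, k, epsilon=1.0, rng=None):
    """Illustrative sketch: choose k random 3-way cliques and release each
    marginal (contingency table) with Laplace noise. `data` is an (m, d)
    integer array; `domain[i]` is the number of values attribute i can take."""
    rng = np.random.default_rng() if rng is None else rng
    d = len(domain)
    cliques = [tuple(int(a) for a in rng.choice(d, size=3, replace=False))
               for _ in range(k)]
    # One record touches one cell in each of the k marginals, so the total
    # L1 sensitivity is k; scale = k / epsilon gives epsilon-DP overall
    # (a simple sequential-composition accounting, assumed here).
    scale = k / epsilon
    noisy = {}
    for cl in cliques:
        edges = [np.arange(domain[a] + 1) for a in cl]
        hist, _ = np.histogramdd(data[:, list(cl)], bins=edges)
        noisy[cl] = hist + rng.laplace(scale=scale, size=hist.shape)
    return noisy  # input to the PROX-PGM / APPGM estimation step

The noisy tables produced this way play the role of the "noisy measurements" whose L1 error the PROX-PGM variants improve upon in Figure 2a.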
HDMM is a batch algorithm and either can or cannot run for a particular workload. Because MWEM iteratively selects measurements, even in high dimensions it can run for some number of iterations before the graphical model structure becomes too complex. By using APPGM instead, HDMM can run for workloads that were previously impossible, and MWEM can run for any number of rounds. We use the FIRE dataset from the 2018 NIST synthetic data competition [40], which includes 15 attributes and m ≈ 300,000 individuals. For HDMM, we consider workloads of k random 3-way marginals, for k = 1, 2, 4, 8, ..., 256, 455, run five trials, and report root mean squared error, the objective function that HDMM optimizes. Figure 3a shows the results. HDMM with a full data vector cannot run for k > 2, but we can still analytically compute the expected error if it were able to run. HDMM + Exact fails to run beyond k = 16, while HDMM + Region Graph is able to run in every setting, substantially expanding the range of settings in which HDMM can be used. Both variants offer the significant error improvements of PGM-style inference, because they impose non-negativity constraints that reduce error. For example, when k = 455, there is a 3× reduction in RMSE. For MWEM, we consider the workload of all 2-way marginals, use a privacy budget of ε = 0.1 per round, and run for as many rounds as possible, until MWEM has measured all 2-way marginals or exceeds a generous time/memory limit of 24 hours and 16 GB. Figure 3b shows the results. As expected, MWEM + Exact runs successfully in early iterations, but exceeds resource limits by 35-40 rounds. In comparison, MWEM + Region Graph can run to completion and eventually measure all 2-way marginals. For a fixed number of rounds, Exact has lower error, in this case substantially so, but results are data-dependent (we evaluate with other datasets in Appendix C). The difference can largely be traced to Exact's better performance on out-of-model cliques. In contrast, HDMM's measurements support all workload queries, so no out-of-model inference is required, and we see little gap between exact and approximate inference. Improving the performance of approximate inference for out-of-model marginals is an area for future work; see Appendix B.

Additional experiments. We also apply APPGM to improve the accuracy of FEM, a recent state-of-the-art query-answering mechanism [33]. The setup is similar to the Dual Query experiment in [16]: we run FEM to completion to release y, but instead of answering queries directly with y, we use APPGM to estimate pseudo-marginals, from which we compute query answers. This leads to a modest error reduction for all ε (Figure 3c; details in Appendix C). In Appendix C we also compare PRIVATE-PGM and APPGM directly to a method proposed in PriView [9] for estimating consistent marginals, and find that PGM-based methods are more accurate for almost all values of ε. We additionally compare PRIVATE-PGM and APPGM to the recent Relaxed Projection method [14], and find that APPGM performs better for ε > 0.1, although it is outperformed for ε ≤ 0.1.
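The MWEM experiment above iterates a select-measure-estimate loop. A rough sketch of a single round is shown below, with the estimation step left to PROX-PGM/APPGM; all names, the even budget split, and the unit-sensitivity assumption for the selection score are ours for illustration:

import numpy as np

def mwem_round(candidates, true_marginal, model_marginal, epsilon, rng=None):
    """One illustrative MWEM-style round: privately pick the candidate clique the
    current model answers worst (exponential mechanism), then measure its marginal
    with Laplace noise. The (clique, noisy marginal) pair is then passed to the
    PGM estimation step to update the model."""
    rng = np.random.default_rng() if rng is None else rng
    eps_select, eps_measure = epsilon / 2, epsilon / 2   # assumed even split
    # Score each candidate by the L1 error of the current model's answer.
    scores = np.array([np.abs(true_marginal(c) - model_marginal(c)).sum()
                       for c in candidates])
    # Exponential mechanism with (assumed) sensitivity 1 for the L1-error score.
    logits = eps_select * (scores - scores.max()) / 2.0
    probs = np.exp(logits)
    chosen = candidates[rng.choice(len(candidates), p=probs / probs.sum())]
    answer = true_marginal(chosen)
    noisy = answer + rng.laplace(scale=1.0 / eps_measure, size=answer.shape)
    return chosen, noisy

Whether the estimation step then uses Exact (PRIVATE-PGM) or Region Graph (APPGM) inference is exactly the trade-off evaluated in Figure 3b.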
Researcher Affiliation | Academia | Ryan McKenna, Siddhant Pradhan, Daniel Sheldon, Gerome Miklau, College of Information and Computer Sciences, University of Massachusetts, Amherst, MA 01002, {rmckenna, sspradhan, sheldon, miklau}@cs.umass.edu
Pseudocode | Yes | Algorithm 1 PROX-PGM [16]
Input: Convex loss function L(µ)
Output: Marginals µ̂, parameters θ̂
  θ̂ = 0
  for t = 1, ..., T do
    µ̂ = MARGINAL-ORACLE(θ̂)
    θ̂ = θ̂ − η_t ∇L(µ̂)
  return µ̂, θ̂
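A minimal Python rendering of Algorithm 1 as quoted above, assuming user-supplied callables for the marginal oracle and the loss gradient; the function name, the flat parameter vector, and the decaying step-size schedule are our illustrative assumptions, not the authors' implementation:

import numpy as np

def prox_pgm(grad_loss, marginal_oracle, dim, T=10000, step=1.0):
    """Sketch of the PROX-PGM loop: mirror-descent-style updates in which a
    marginal oracle maps model parameters theta to (pseudo-)marginals mu."""
    theta = np.zeros(dim)                        # theta_hat = 0
    mu = None
    for t in range(1, T + 1):
        mu = marginal_oracle(theta)              # mu_hat = MARGINAL-ORACLE(theta_hat)
        eta_t = step / np.sqrt(t)                # assumed decaying step size
        theta = theta - eta_t * grad_loss(mu)    # theta_hat -= eta_t * grad L(mu_hat)
    return mu, theta                             # estimated marginals and parameters

In the paper, MARGINAL-ORACLE is instantiated either with exact inference (PRIVATE-PGM) or with approximate region-graph inference (APPGM, via Saturated Region Graph or Factor Graph).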
Open Source Code | No | The paper does not provide a direct link to the source code for the methodology described, nor does it explicitly state that the code will be released or is available in supplementary materials.
Open Datasets | Yes | We use the FIRE dataset from the 2018 NIST synthetic data competition [40], which includes 15 attributes and m ≈ 300,000 individuals. [...] We also apply APPGM to improve the accuracy of FEM, a recent state-of-the-art query-answering mechanism [33]. The setup is similar to the Dual Query experiment in [16]: ... (Figure 3c; details in Appendix C). In Appendix C we also compare PRIVATE-PGM and APPGM directly to a method proposed in PriView [9] for estimating consistent marginals, and find that PGM-based methods are more accurate for almost all values of ε. We additionally compare PRIVATE-PGM and APPGM to the recent Relaxed Projection method [14], and find that APPGM performs better for ε > 0.1, although it is outperformed for ε ≤ 0.1.
Dataset Splits | No | The paper mentions using synthetic data and various datasets (FIRE, ADULT) but does not specify training, validation, or test splits (e.g., percentages, counts, or references to predefined splits).
Hardware Specification | Yes | Scalability experiments were conducted on two cores of a machine with a 2.4 GHz CPU and 16 GB of RAM.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific libraries, frameworks, or solvers with their versions).
Experiment Setup | Yes | We then run 10,000 iterations of PROX-PGM using differing choices for MARGINAL-ORACLE [...] For MWEM, we consider the workload of all 2-way marginals, use a privacy budget of ε = 0.1 per round, and run for as many rounds as possible, until MWEM has measured all 2-way marginals or exceeds a generous time/memory limit of 24 hours and 16 GB. [...] We utilize a simple heuristic that seems to work well in most settings, but may require manual tuning in some cases. Second, we observed that introducing damping into CONVEX-GBP improved its stability and robustness, especially for very dense region graphs. Finally, our MWEM experiment revealed that it is better to use PRIVATE-PGM over APPGM in the context of an MWEM-style algorithm, even though PRIVATE-PGM is more limited in the number of rounds it can run for.