Private Synthetic Data for Multitask Learning and Marginal Queries

Authors: Giuseppe Vietri, Cedric Archambeau, Sergul Aydore, William Brown, Michael Kearns, Aaron Roth, Ankit Siva, Shuai Tang, Zhiwei Steven Wu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide comprehensive empirical evaluations comparing RAP++ against several benchmarks, including PGM [MSM19], DP-CTGAN [FDK22], and DP-MERF [HAP21] on datasets derived from the US Census.
Researcher Affiliation | Collaboration | Giuseppe Vietri (University of Minnesota, vietr002@umn.edu); Cedric Archambeau (Amazon AWS AI/ML, cedrica@amazon.com); Sergul Aydore (Amazon AWS AI/ML, saydore@amazon.com); William Brown (Columbia University, w.brown@columbia.edu); Michael Kearns (Amazon AWS AI/ML and University of Pennsylvania, mkearns@cis.upenn.edu); Aaron Roth (Amazon AWS AI/ML and University of Pennsylvania, aaroth@cis.upenn.edu); Ankit Siva (Amazon AWS AI/ML, ankitsiv@amazon.com); Shuai Tang (Amazon AWS AI/ML, shuat@amazon.com); Zhiwei Steven Wu (Amazon AWS AI/ML and Carnegie Mellon University, zhiweiw@andrew.cmu.edu)
Pseudocode | Yes | Algorithm 1: Relaxed Projection with Sigmoid Temperature Annealing
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | Yes | We use a suite of new datasets derived from the US Census, which are introduced in [DHMS21]. These datasets include five pre-defined prediction tasks, predicting labels related to income, employment, health, transportation, and housing. Each task defines feature columns (categorical and numerical) and target columns, where feature columns are used to train a model to predict the target column. We used the folktables package [DHMS21] to extract features and tasks. In the appendix, we include a table that summarizes the number of categorical and numerical features for each ACS task and the number of rows in each of the 25 datasets.
Dataset Splits | No | For each dataset, we use 80 percent of the rows as a training dataset and the remainder as a test dataset. No separate validation split was explicitly mentioned.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments were provided in the paper.
Software Dependencies | No | The paper mentions JAX and PyTorch but does not provide specific version numbers for the software dependencies used in the experiments.
Experiment Setup | Yes | For both RAP and PGM we discretized all numerical features using equal-sized binning and compared performance for bin counts in {10, 20, 30, 50, 100}. We chose 30 bins for the reported results, as this performs well overall across different tasks and privacy levels; the appendix shows results for other bin sizes. The other relevant parameter for both PGM and RAP is the number of epochs of adaptivity, which is fixed to d − 1, where d is the number of columns in the data. Since RAP++ optimizes over two query classes (CM and LT), it has two further parameters: T_CM, the number of adaptive epochs for selecting CM queries, and T_LT, the number of adaptive epochs for selecting LT queries. To be consistent with PGM and RAP we always choose T_CM = d_C − 1, where d_C is the number of categorical columns in the data. Finally, K is the number of queries selected per epoch, as described in Algorithm 2. We fix T_LT = 50 and K = 10, since these values work well across all settings in our experiments.
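The "Relaxed Projection with Sigmoid Temperature Annealing" named in the Pseudocode row relaxes hard threshold (LT) queries into differentiable sigmoid surrogates whose temperature is annealed upward, so the surrogate sharpens toward the exact threshold count. A minimal sketch of that idea (the function names and the geometric schedule are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np

def sigmoid(x, temperature):
    # Higher temperature -> steeper sigmoid -> closer to a hard threshold.
    return 1.0 / (1.0 + np.exp(-temperature * x))

def relaxed_threshold_query(col, threshold, temperature):
    # Differentiable surrogate for the fraction of rows with value > threshold.
    return sigmoid(col - threshold, temperature).mean()

# Illustrative annealing schedule: start smooth, sharpen over rounds.
temperatures = np.geomspace(1.0, 100.0, num=5)
```

As the temperature grows, the relaxed answer approaches the exact query answer, while the smooth early rounds keep gradients informative for projection.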
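The 80/20 train/test partition noted in the Dataset Splits row can be sketched as a shuffled row split; the helper name, seed, and the assumption of a uniform random split are illustrative, not taken from the paper:

```python
import numpy as np

def split_rows(X, train_frac=0.8, seed=0):
    # Shuffle row indices, then take the first train_frac of rows for training.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], X[idx[cut:]]
```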
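The "equal-sized binning" with 30 bins described in the Experiment Setup row can be sketched with NumPy. The row does not say whether bins are equal-width or equal-frequency, so equal-width is an assumption here:

```python
import numpy as np

def discretize_equal_width(col, n_bins=30):
    # Map a numerical column to integer bin ids in 0..n_bins-1.
    edges = np.linspace(col.min(), col.max(), n_bins + 1)
    # Digitize against the interior edges so the maximum lands in the last bin.
    return np.digitize(col, edges[1:-1])
```

After discretization, each numerical column becomes a categorical column with `n_bins` values, which is the representation RAP and PGM consume for marginal queries.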