Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Graphical-model based estimation and inference for differential privacy
Authors: Ryan McKenna, Daniel Sheldon, Gerome Miklau
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we measure the accuracy and scalability improvements enabled by probabilistic graphical-model (PGM) based estimation when it is incorporated into existing privacy mechanisms. |
| Researcher Affiliation | Academia | ¹University of Massachusetts, Amherst; ²Mount Holyoke College. |
| Pseudocode | Yes | Algorithm 1 Proximal Estimation Algorithm; Algorithm 2 Accelerated Proximal Estimation Algorithm |
| Open Source Code | No | The paper does not include any statement about making its source code publicly available or provide a link to a code repository for the methodology described. |
| Open Datasets | No | The paper uses datasets such as 'Titanic', 'Adult', 'Loans', and 'Stroke' and lists their properties in Table 1. However, it does not provide direct links, DOIs, repository names, or specific bibliographic citations for accessing these datasets. |
| Dataset Splits | No | The paper focuses on estimating data distributions from noisy measurements and evaluating query accuracy, rather than training a predictive model using traditional train/validation/test dataset splits. Therefore, no such explicit splits are mentioned or provided for the experimental setup. |
| Hardware Specification | No | The paper states: 'Experiments are done on 2 cores of a single compute cluster node with 16 GB of RAM and 2.4 GHz processors.' This provides some general specifications but lacks specific CPU or GPU models, or more detailed hardware components needed for full reproducibility. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). While it mentions tools like 'Autograd' and 'LSMR', it does not list their versions or other required software with specific versions. |
| Experiment Setup | Yes | We run Algorithm 1 with line search for Dual Query and Algorithm 2 for the other mechanisms, each for 10000 iterations. We use a privacy budget of ε = 1.0 (and δ = 0.001 for Dual Query). The variable η_t in this algorithm is a step size, which can be constant, decreasing, or found via line search. |
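The step-size choice η_t mentioned in the setup row above (constant, decreasing, or via line search) can be illustrated with a generic proximal-gradient sketch using backtracking line search. This is not the paper's Algorithm 1; the function names and the quadratic test problem below are illustrative assumptions:

```python
import numpy as np

def prox_grad_linesearch(f, grad_f, prox, x0, eta0=1.0, beta=0.5, iters=100):
    """Generic proximal gradient descent with backtracking line search.

    f      : smooth part of the objective (used only in the line search)
    grad_f : gradient of the smooth part
    prox   : proximal operator of the non-smooth part, prox(v, eta)
    eta0   : initial trial step size; beta shrinks it on failure
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_f(x)
        eta = eta0
        while True:
            z = prox(x - eta * g, eta)
            # Standard sufficient-decrease condition for backtracking:
            # accept eta when the quadratic upper bound holds at z.
            quad = f(x) + g @ (z - x) + np.sum((z - x) ** 2) / (2 * eta)
            if f(z) <= quad:
                break
            eta *= beta  # shrink step size and retry
        x = z
    return x

# Illustrative usage: minimize 0.5 * ||x - b||^2 with no non-smooth term,
# so the proximal operator is the identity and the minimizer is b.
b = np.array([1.0, -2.0, 3.0])
f = lambda x: 0.5 * np.sum((x - b) ** 2)
grad_f = lambda x: x - b
identity_prox = lambda v, eta: v
x_hat = prox_grad_linesearch(f, grad_f, identity_prox, np.zeros(3))
```

Swapping `identity_prox` for a soft-thresholding operator would give a lasso-style variant; the line-search loop itself is unchanged.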