A General Framework for Auditing Differentially Private Machine Learning
Authors: Fred Lu, Joseph Munoz, Maya Fuchs, Tyler LeBlond, Elliott Zaresky-Williams, Edward Raff, Francis Ferraro, Brian Testa
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate significantly improved auditing power over previous approaches on a variety of models including logistic regression, Naive Bayes, and random forest. |
| Researcher Affiliation | Collaboration | Fred Lu, Booz Allen Hamilton; Joseph Munoz, Booz Allen Hamilton; Maya Fuchs, Booz Allen Hamilton; Tyler LeBlond, Booz Allen Hamilton; Elliott Zaresky-Williams, Booz Allen Hamilton; Edward Raff, Booz Allen Hamilton; Francis Ferraro, University of Maryland, Baltimore County; Brian Testa, Air Force Research Laboratory |
| Pseudocode | Yes | Algorithm 1 ML-Audit: Optimizing output set S |
| Open Source Code | No | The paper mentions using existing libraries like 'diffprivlib' and 'Opacus' but does not provide any statement or link indicating that the authors have released the source code for their own framework or methodology described in the paper. |
| Open Datasets | Yes | We assess Naive Bayes, logistic regression (output and objective perturbation), and random forest on common machine learning datasets: adult, credit, iris, breast-cancer, banknote, thoracic. We use the diffprivlib library [27] and implement output perturbation following [22]. Additionally, we test DP-SGD on FMNIST and CIFAR10 (here with N = 500) using [28]. |
| Dataset Splits | No | The paper describes the number of times models were trained (N=10000 or N=500 for DP-SGD) for Monte Carlo estimates, but it does not provide specific train/validation/test dataset splits for the machine learning models themselves. Algorithm 1 refers to training a classifier for `p(D|z)` based on samples, not a general validation split for the main models. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. It only mentions general experimental setups without hardware context. |
| Software Dependencies | No | The paper mentions using the 'diffprivlib library' and 'Opacus' but does not specify the version numbers for these or any other software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | We evaluate the DP mechanisms over a range of ε_th: {0.1, 0.25, 0.5, 1, 2, 4, 8, 16, 50}. For a given dataset D, we perturb k ∈ {1, 2, 4, 8} points to get D′ and train N = 10000 times for each to determine the appropriate auditing set S. Then we obtain N new samples to perform the final Monte Carlo estimate and obtain the lower bound ε̂_lb. We use confidence level = 0.05 throughout. |
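
The Pseudocode row cites "Algorithm 1 ML-Audit: Optimizing output set S", and the Dataset Splits row notes that the algorithm trains a classifier approximating `p(D|z)` from sampled mechanism outputs. The sketch below illustrates that general recipe under our own assumptions: it is not the paper's Algorithm 1, and `mechanism`, `D`, `D_prime`, and the 0.5 threshold are placeholders.

```python
# Hedged sketch: run the DP mechanism repeatedly on neighbouring datasets D and D',
# train a classifier to approximate p(D | z) from the outputs z, and take the
# auditing set S to be the region where that classifier confidently attributes an
# output to D. Illustrative only; not the authors' exact procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression as Attack

def sample_outputs(mechanism, data, n_runs):
    # Each run is assumed to return a fixed-length vector summarising the trained
    # model (e.g. its parameters); stacking gives an (n_runs, d) array.
    return np.stack([mechanism(data) for _ in range(n_runs)])

def fit_output_set(mechanism, D, D_prime, n_runs=10000, threshold=0.5):
    Z0 = sample_outputs(mechanism, D, n_runs)        # outputs under D (label 0)
    Z1 = sample_outputs(mechanism, D_prime, n_runs)  # outputs under D' (label 1)
    Z = np.vstack([Z0, Z1])
    labels = np.concatenate([np.zeros(n_runs), np.ones(n_runs)])
    attack = Attack(max_iter=1000).fit(Z, labels)
    # Membership test for S: outputs that the attack assigns to D with high confidence.
    return lambda z: attack.predict_proba(z.reshape(1, -1))[0, 0] >= threshold
```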
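The Open Datasets row names the diffprivlib library and common scikit-learn-style datasets. Since the authors' own code is not released, the following is a minimal sketch of how one of the audited mechanisms could be instantiated; the dataset choice (breast-cancer) and hyperparameter values are assumptions, not the paper's settings.

```python
# Sketch: differentially private Naive Bayes and logistic regression via diffprivlib.
import numpy as np
from sklearn.datasets import load_breast_cancer
from diffprivlib.models import GaussianNB, LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# diffprivlib expects explicit data bounds / norms so the privacy accounting
# does not silently depend on the data itself.
bounds = (X.min(axis=0), X.max(axis=0))
data_norm = float(np.linalg.norm(X, axis=1).max())

eps = 1.0  # one value from the grid of epsilon_th in the Experiment Setup row

nb = GaussianNB(epsilon=eps, bounds=bounds).fit(X, y)
lr = LogisticRegression(epsilon=eps, data_norm=data_norm).fit(X, y)
print(nb.score(X, y), lr.score(X, y))
```

The paper's random forest experiments would follow the same pattern with the corresponding diffprivlib model, and output perturbation is implemented separately following the cited reference [22].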
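For the FMNIST and CIFAR10 experiments, the Open Datasets and Software Dependencies rows point to Opacus (version unspecified) for DP-SGD. The snippet below is a generic Opacus 1.x setup sketch, not the paper's training script; the model, synthetic data, and noise/clipping values are placeholders.

```python
# Sketch: wrapping an ordinary PyTorch training loop with Opacus for DP-SGD.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = optim.SGD(model.parameters(), lr=0.1)
dataset = TensorDataset(torch.randn(512, 1, 28, 28), torch.randint(0, 10, (512,)))
loader = DataLoader(dataset, batch_size=64)

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # placeholder values, not the paper's settings
    max_grad_norm=1.0,
)

criterion = nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```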
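The Experiment Setup row says the final step draws N fresh samples, performs a Monte Carlo estimate, and reports a lower bound ε̂_lb at confidence level 0.05. The sketch below shows one standard way such a bound is computed in DP auditing, using Clopper-Pearson intervals on the hit rates for the auditing set S; the exact interval construction and confidence bookkeeping in the paper may differ, so treat this as an assumption-laden illustration.

```python
# Sketch: Monte Carlo lower bound on epsilon from empirical hit counts,
# eps_lb = log(p_lo / q_hi), where p_lo lower-bounds Pr[A(D) in S] and
# q_hi upper-bounds Pr[A(D') in S] via Clopper-Pearson intervals.
import numpy as np
from scipy.stats import beta

def clopper_pearson(successes, trials, alpha):
    lo = 0.0 if successes == 0 else beta.ppf(alpha / 2, successes, trials - successes + 1)
    hi = 1.0 if successes == trials else beta.ppf(1 - alpha / 2, successes + 1, trials - successes)
    return lo, hi

def eps_lower_bound(hits_D, hits_Dprime, n_runs, alpha=0.05):
    p_lo, _ = clopper_pearson(hits_D, n_runs, alpha)       # Pr[A(D) in S]
    _, q_hi = clopper_pearson(hits_Dprime, n_runs, alpha)  # Pr[A(D') in S]
    if p_lo <= 0.0 or q_hi <= 0.0:
        return 0.0
    return max(0.0, float(np.log(p_lo / q_hi)))

# Example (hypothetical counts): with N = 10000 fresh runs per dataset,
# eps_lower_bound(9000, 3000, 10000) is roughly 1.06.
```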