Auditing Private Prediction
Authors: Karan Chadha, Matthew Jagielski, Nicolas Papernot, Christopher A. Choquette-Choo, Milad Nasr
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that (i) the privacy analysis of private prediction can be improved, (ii) algorithms which are easier to poison lead to much higher privacy leakage, and (iii) the privacy leakage is significantly lower for adversaries without query control than those with full control. |
| Researcher Affiliation | Collaboration | 1Stanford University, Part of work done while an intern at Google DeepMind. 2Google DeepMind. |
| Pseudocode | Yes | Algorithm 1 Private prediction framework |
| Open Source Code | No | The paper states 'Karan Chadha wrote the code for the experiments.' but does not provide an explicit statement about releasing the source code for the described methodology or a link to a repository. |
| Open Datasets | Yes | We audit PATE on the MNIST, CIFAR10 and Fashion MNIST (Xiao et al., 2017) datasets. For Prompt PATE, we work with the SST2 (Sentiment Classification) (Socher et al., 2013), AGNEWS (Article Classification) (Zhang et al., 2015b), DBPedia (Topic Classification) (Zhang et al., 2015a) and TREC (Question Classification) (Li & Roth, 2002) datasets. |
| Dataset Splits | No | The paper mentions using training and test datasets (e.g., 'training dataset S', 'simulated using the test dataset'), and provides total dataset sizes, but does not explicitly state the specific train/validation/test splits (e.g., percentages, sample counts, or clear references to predefined splits for reproducibility). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using optimizers (Adam) and pre-trained models (MISTRAL-7B, ViT-L/16, RoBERTa-Large), but does not specify software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | For MNIST and Fashion MNIST, we train 250 teachers and use Gaussian noise with σ = 40 to calculate the noisy argmax and for CIFAR10, we train 200 teachers and use Gaussian noise with σ = 25. [...] We train all networks with the Adam optimizer (Kingma & Ba, 2014) with learning rate set to 0.03 and a batch size of 16. For all datasets, we use 200 teachers and one example per teacher which is randomly sampled from the respective datasets. [...] with k = 200 and subsampling rate γ = 0.2 and using Gaussian noise with standard deviation σ = 30 for image datasets and σ = 20 for text datasets. |
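The noisy-argmax aggregation quoted in the experiment setup (e.g., 250 teachers with Gaussian noise σ = 40 for MNIST) can be illustrated with a minimal sketch. This is not the authors' code (none was released, per the table above); the function name and vote representation are assumptions for illustration:

```python
import numpy as np

def noisy_argmax(teacher_votes, num_classes, sigma, rng=None):
    """Gaussian noisy-argmax aggregation of teacher predictions (PATE-style).

    teacher_votes: integer labels predicted by each teacher for one query.
    sigma: standard deviation of Gaussian noise added to each vote count.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Tally votes per class, then perturb each count with Gaussian noise.
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    noisy_counts = counts + rng.normal(0.0, sigma, size=num_classes)
    return int(np.argmax(noisy_counts))

# Hypothetical usage with the MNIST parameters quoted above
# (250 teachers, 10 classes, sigma = 40):
rng = np.random.default_rng(0)
votes = rng.integers(0, 10, size=250)  # simulated teacher labels
label = noisy_argmax(votes, num_classes=10, sigma=40, rng=rng)
```

With a large σ relative to the vote margin, the returned label can differ from the plain-majority label, which is the privacy/utility trade-off the audited algorithms navigate.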