Low-Cost High-Power Membership Inference Attacks
Authors: Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform our experiments on CIFAR-10, CIFAR-100, CINIC-10, ImageNet and Purchase-100, which are benchmark datasets commonly used for MIA evaluations. We use the standard metrics, notably the FPR versus TPR curve and the area under the ROC curve (AUC), for analyzing attacks. (A sketch of this evaluation follows the table.) |
| Researcher Affiliation | Academia | National University of Singapore (NUS), CS Department. Correspondence to: Sajjad Zarifzadeh <s.zari@nus.edu.sg>, Reza Shokri <reza@comp.nus.edu.sg>. |
| Pseudocode | Yes | B.1. Pseudo-code of RMIA (Algorithm 1: MIA Score Computation with RMIA). The input to this algorithm is k reference models Θ, the target model θ, the target (test) sample x, the parameter γ, and a scaling factor a as described in Appendix B.2.2. We assume the reference models Θ are pre-trained on random samples from a population dataset available to the adversary; each sample from the population dataset is included in the training of half of the reference models. The algorithm also takes an online flag which indicates whether we intend to run MIA in the online mode. (A hedged sketch of this score computation follows the table.) |
| Open Source Code | Yes | The reader can reproduce our results using our source code: https://github.com/privacytrustlab/ml_privacy_meter/tree/master/research/2024_rmia (RMIA) |
| Open Datasets | Yes | We perform our experiments on CIFAR-10, CIFAR-100, CINIC-10, ImageNet and Purchase-100, which are benchmark datasets commonly used for MIA evaluations. |
| Dataset Splits | No | The paper specifies how training sets are formed (e.g., "half of the dataset chosen at random") and mentions test accuracy, but it does not give explicit training/validation/test splits; in particular, it provides no information about a distinct validation split. |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., GPU models, CPU types, or memory) used for running the experiments. It focuses solely on software and model configurations. |
| Software Dependencies | No | The paper mentions model architectures such as "WideResNet", "ResNet-50", and "MLP", and refers to implementing prior work using their provided code, but it does not list specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For CIFAR-10 (He et al., 2016) (a traditional image classification dataset), we train a WideResNet (depth 28, width 2) for 100 epochs on half of the dataset chosen at random. We set the batch size to 256. We assess the impact of attacks on larger datasets by examining the ImageNet dataset, comprising approximately 1.2 million images with 1000 class labels. We train the ResNet-50 on half of the dataset for 100 epochs, with a batch size of 256, a learning rate of 0.1, and a weight decay of 1e-4. We also include the result of attacks on the Purchase-100 dataset (a tabular dataset of shopping records) (Shokri et al., 2017), where models are 4-layer MLPs with layer units=[512, 256, 128, 64], trained on 25k samples for 50 epochs. (A training-configuration sketch follows the table.) |
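
The FPR-versus-TPR and AUC evaluation quoted in the Research Type row can be reproduced from raw attack scores with standard tooling. Below is a minimal sketch using scikit-learn; the scores and membership labels are synthetic placeholders, not the paper's data.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder inputs: mia_scores are attack scores per target sample;
# is_member marks whether the sample was in the target model's training set.
rng = np.random.default_rng(0)
mia_scores = rng.normal(size=1000) + np.repeat([1.0, 0.0], 500)  # toy scores
is_member = np.repeat([1, 0], 500)                               # toy labels

# ROC curve and AUC, the headline metrics used for comparing attacks.
fpr, tpr, _ = roc_curve(is_member, mia_scores)
auc = roc_auc_score(is_member, mia_scores)

# TPR at a fixed low FPR (e.g., 0.1%), the low-error regime MIA papers stress.
tpr_at_low_fpr = tpr[np.searchsorted(fpr, 1e-3, side="right") - 1]
print(f"AUC={auc:.3f}, TPR@0.1%FPR={tpr_at_low_fpr:.4f}")
```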
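For the Pseudocode row, the following is a hedged sketch of the RMIA score from Algorithm 1: compare the likelihood ratio Pr(x|θ)/Pr(x) of the target sample against the same ratio for population samples z, and report the fraction of z for which the pairwise ratio exceeds γ. The function signature, the OUT-only averaging, and the exact form of the offline rescaling with the factor a are assumptions; the authors' reference implementation lives in the linked repository.

```python
import numpy as np

def rmia_score(pr_x_target, pr_x_refs, pr_z_target, pr_z_refs,
               gamma=2.0, online=True, a=0.3):
    """Sketch of the RMIA MIA score for one target sample x.

    pr_x_target: Pr(x | theta), probability of x under the target model.
    pr_x_refs:   array of Pr(x | theta'), one entry per reference model.
    pr_z_target: array of Pr(z | theta) over population samples z.
    pr_z_refs:   (num_z, num_refs) array of Pr(z | theta') values.
    gamma:       threshold on the pairwise likelihood ratio.
    online:      if False, only OUT reference models are available and
                 Pr(x) is approximated using the scaling factor `a`
                 (illustrative form, see Appendix B.2.2 of the paper).
    """
    if online:
        pr_x = pr_x_refs.mean()                # average over IN and OUT models
        pr_z = pr_z_refs.mean(axis=1)
    else:
        pr_x_out = pr_x_refs.mean()            # OUT-only estimate
        pr_x = 0.5 * ((1 + a) * pr_x_out + (1 - a))   # assumed rescaling
        pr_z_out = pr_z_refs.mean(axis=1)
        pr_z = 0.5 * ((1 + a) * pr_z_out + (1 - a))

    ratio_x = pr_x_target / pr_x               # Pr(x|theta) / Pr(x)
    ratio_z = pr_z_target / pr_z               # Pr(z|theta) / Pr(z), per z
    # Score = fraction of population samples z dominated by x at margin gamma.
    return np.mean(ratio_x / ratio_z >= gamma)

# Toy usage with 4 reference models and 3 population samples (shapes only).
score = rmia_score(0.9, np.array([0.5, 0.6, 0.4, 0.55]),
                   np.array([0.7, 0.2, 0.5]), np.random.rand(3, 4))
```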
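Finally, the Experiment Setup row quotes concrete hyperparameters for the CIFAR-10 runs (WideResNet-28-2, 100 epochs, batch size 256, trained on a random half of the dataset). A minimal PyTorch sketch of that configuration follows; since WideResNet-28-2 is not shipped with torchvision, a ResNet-18 stands in for it, and the SGD settings borrow the learning rate and weight decay the paper reports for ImageNet, which is an assumption.

```python
import torch
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader, Subset

EPOCHS, BATCH_SIZE = 100, 256  # values quoted in the paper for CIFAR-10
device = "cuda" if torch.cuda.is_available() else "cpu"

full_train = torchvision.datasets.CIFAR10(
    "data", train=True, download=True, transform=T.ToTensor())

# "Half of the dataset chosen at random" forms the target model's train set;
# the held-out half can serve as non-member (OUT) data for the attack.
perm = torch.randperm(len(full_train))
members = Subset(full_train, perm[: len(full_train) // 2].tolist())
loader = DataLoader(members, batch_size=BATCH_SIZE, shuffle=True)

# Stand-in architecture: the paper trains WideResNet-28-2, which torchvision
# does not provide; ResNet-18 keeps this sketch self-contained.
model = torchvision.models.resnet18(num_classes=10).to(device)

# lr=0.1 and weight_decay=1e-4 are the values the paper gives for the
# ImageNet/ResNet-50 runs; reusing them here (and the momentum value)
# is an assumption.
opt = torch.optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(EPOCHS):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```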