Low-Cost High-Power Membership Inference Attacks
Authors: Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform our experiments on CIFAR-10, CIFAR-100, CINIC-10, ImageNet and Purchase-100, which are benchmark datasets commonly used for MIA evaluations. We use the standard metrics, notably the FPR versus TPR curve and the area under the ROC curve (AUC), for analyzing attacks. (A sketch of this evaluation follows the table.) |
| Researcher Affiliation | Academia | National University of Singapore (NUS), CS Department. Correspondence to: Sajjad Zarifzadeh <s.zari@nus.edu.sg>, Reza Shokri <reza@comp.nus.edu.sg>. |
| Pseudocode | Yes | B.1. Pseudo-code of RMIA (Algorithm 1: MIA Score Computation with RMIA). The input to this algorithm is k reference models Θ, the target model θ, the target (test) sample x, the parameter γ, and a scaling factor a as described in Appendix B.2.2. We assume the reference models Θ are pre-trained on random samples from a population dataset available to the adversary; each sample from the population dataset is included in the training of half of the reference models. The algorithm also takes an online flag which indicates whether we intend to run MIA in the online mode. (A hedged sketch of this score computation follows the table.) |
| Open Source Code | Yes | The reader can reproduce our results using our source code: https://github.com/privacytrustlab/ml_privacy_meter/tree/master/research/2024_rmia (RMIA) |
| Open Datasets | Yes | We perform our experiments on CIFAR-10, CIFAR-100, CINIC-10, ImageNet and Purchase-100, which are benchmark datasets commonly used for MIA evaluations. |
| Dataset Splits | No | The paper specifies how training sets are formed (e.g., "half of the dataset chosen at random") and mentions test accuracy, but it does not give explicit training/validation/test splits; in particular, it provides no information about a distinct validation split. |
| Hardware Specification | No | The paper does not specify any particular hardware components (e.g., GPU models, CPU types, or memory) used for running the experiments. It focuses solely on software and model configurations. |
| Software Dependencies | No | The paper mentions model architectures such as "WideResNet", "ResNet-50", and "MLP", and refers to implementing prior work using their provided code, but it does not list specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For CIFAR-10 (He et al., 2016) (a traditional image classification dataset), we train a WideResNet (depth 28, width 2) for 100 epochs on half of the dataset chosen at random. We set the batch size to 256. We assess the impact of attacks on larger datasets by examining the ImageNet dataset, comprising approximately 1.2 million images with 1000 class labels. We train the ResNet-50 on half of the dataset for 100 epochs, with a batch size of 256, a learning rate of 0.1, and a weight decay of 1e-4. We also include the result of attacks on the Purchase-100 dataset (a tabular dataset of shopping records) (Shokri et al., 2017), where models are 4-layer MLPs with layer units=[512, 256, 128, 64], trained on 25k samples for 50 epochs. (A training-configuration sketch follows the table.) |
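
The FPR-versus-TPR and AUC evaluation quoted in the Research Type row can be reproduced from raw attack scores with standard tooling. Below is a minimal sketch using scikit-learn; the scores and membership labels are synthetic placeholders, not the paper's data.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Placeholder inputs: mia_scores are attack scores per target sample;
# is_member marks whether the sample was in the target model's training set.
rng = np.random.default_rng(0)
mia_scores = rng.normal(size=1000) + np.repeat([1.0, 0.0], 500)  # toy scores
is_member = np.repeat([1, 0], 500)                               # toy labels

# ROC curve and AUC, the headline metrics used for comparing attacks.
fpr, tpr, _ = roc_curve(is_member, mia_scores)
auc = roc_auc_score(is_member, mia_scores)

# TPR at a fixed low FPR (e.g., 0.1%), the low-error regime MIA papers stress.
tpr_at_low_fpr = tpr[np.searchsorted(fpr, 1e-3, side="right") - 1]
print(f"AUC={auc:.3f}, TPR@0.1%FPR={tpr_at_low_fpr:.4f}")
```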
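For the Pseudocode row, the following is a hedged sketch of the RMIA score from Algorithm 1: compare the likelihood ratio Pr(x|θ)/Pr(x) of the target sample against the same ratio for population samples z, and report the fraction of z for which the pairwise ratio exceeds γ. The function signature, the OUT-only averaging, and the exact form of the offline rescaling with the factor a are assumptions; the authors' reference implementation lives in the linked repository.

```python
import numpy as np

def rmia_score(pr_x_target, pr_x_refs, pr_z_target, pr_z_refs,
               gamma=2.0, online=True, a=0.3):
    """Sketch of the RMIA MIA score for one target sample x.

    pr_x_target: Pr(x | theta), probability of x under the target model.
    pr_x_refs:   array of Pr(x | theta'), one entry per reference model.
    pr_z_target: array of Pr(z | theta) over population samples z.
    pr_z_refs:   (num_z, num_refs) array of Pr(z | theta') values.
    gamma:       threshold on the pairwise likelihood ratio.
    online:      if False, only OUT reference models are available and
                 Pr(x) is approximated using the scaling factor `a`
                 (illustrative form, see Appendix B.2.2 of the paper).
    """
    if online:
        pr_x = pr_x_refs.mean()                # average over IN and OUT models
        pr_z = pr_z_refs.mean(axis=1)
    else:
        pr_x_out = pr_x_refs.mean()            # OUT-only estimate
        pr_x = 0.5 * ((1 + a) * pr_x_out + (1 - a))   # assumed rescaling
        pr_z_out = pr_z_refs.mean(axis=1)
        pr_z = 0.5 * ((1 + a) * pr_z_out + (1 - a))

    ratio_x = pr_x_target / pr_x               # Pr(x|theta) / Pr(x)
    ratio_z = pr_z_target / pr_z               # Pr(z|theta) / Pr(z), per z
    # Score = fraction of population samples z dominated by x at margin gamma.
    return np.mean(ratio_x / ratio_z >= gamma)

# Toy usage with 4 reference models and 3 population samples (shapes only).
score = rmia_score(0.9, np.array([0.5, 0.6, 0.4, 0.55]),
                   np.array([0.7, 0.2, 0.5]), np.random.rand(3, 4))
```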
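Finally, the Experiment Setup row quotes concrete hyperparameters for the CIFAR-10 runs (WideResNet-28-2, 100 epochs, batch size 256, trained on a random half of the dataset). A minimal PyTorch sketch of that configuration follows; since WideResNet-28-2 is not shipped with torchvision, a ResNet-18 stands in for it, and the SGD settings borrow the learning rate and weight decay the paper reports for ImageNet, which is an assumption.

```python
import torch
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader, Subset

EPOCHS, BATCH_SIZE = 100, 256  # values quoted in the paper for CIFAR-10
device = "cuda" if torch.cuda.is_available() else "cpu"

full_train = torchvision.datasets.CIFAR10(
    "data", train=True, download=True, transform=T.ToTensor())

# "Half of the dataset chosen at random" forms the target model's train set;
# the held-out half can serve as non-member (OUT) data for the attack.
perm = torch.randperm(len(full_train))
members = Subset(full_train, perm[: len(full_train) // 2].tolist())
loader = DataLoader(members, batch_size=BATCH_SIZE, shuffle=True)

# Stand-in architecture: the paper trains WideResNet-28-2, which torchvision
# does not provide; ResNet-18 keeps this sketch self-contained.
model = torchvision.models.resnet18(num_classes=10).to(device)

# lr=0.1 and weight_decay=1e-4 are the values the paper gives for the
# ImageNet/ResNet-50 runs; reusing them here (and the momentum value)
# is an assumption.
opt = torch.optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(EPOCHS):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```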