On the Importance of Difficulty Calibration in Membership Inference Attacks
Authors: Lauren Watson, Chuan Guo, Graham Cormode, Alexandre Sablayrolles
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the effect of difficulty calibration, we perform a comprehensive evaluation of several score-based attacks on standard benchmark datasets. |
| Researcher Affiliation | Collaboration | Lauren Watson (University of Edinburgh); Chuan Guo, Graham Cormode, Alexandre Sablayrolles (Meta AI). Work done during an internship at Facebook. Email: lauren.watson@ed.ac.uk, {chuanguo, gcormode, asablayrolles}@fb.com |
| Pseudocode | No | The paper describes its methods and algorithms in paragraph text and mathematical equations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | An implementation of these attacks is available at https://github.com/facebookresearch/calibration_membership. |
| Open Datasets | Yes | We perform experiments on several benchmark classification datasets: German Credit, Hepatitis and Adult datasets from the UCI Machine Learning Repository (Dua & Graff, 2017), MNIST (LeCun et al., 1998), CIFAR10/100 (Krizhevsky et al., 2009), and ImageNet (Deng et al., 2009). |
| Dataset Splits | Yes | We split the data into two sets: a private set, known only to the trainer, and a public set, which is used for training reference models and selecting the decision threshold τ. The trainer trains their model h on half of the private set, keeping the other half as non-members. ... To find a threshold for optimal accuracy, we first split the public set of examples in half again, and treat one half as members, with the rest as non-members. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions the use of the 'Opacus' library for differentially private training, but it does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | The target models are trained for between 50 and 200 epochs, with batch sizes varying from 4 (for very small datasets) to 1024. For optimization, we use SGD with a learning rate of 0.1, Nesterov momentum of 0.9 and a cosine learning rate schedule for the CIFAR10/100 and ImageNet datasets. Smaller datasets such as the German Credit dataset also used weight decay of 1×10⁻⁴. |
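The threshold-selection step quoted in the Dataset Splits row (treating half of the public set as members and half as non-members, then picking the decision threshold τ that maximizes attack accuracy) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name, the brute-force search over candidate thresholds, and the "predict member when the score is at least τ" convention are all assumptions.

```python
import numpy as np

def best_threshold(scores_members, scores_nonmembers):
    """Pick the threshold tau that maximizes membership-inference
    accuracy on a public set whose halves are treated as members and
    non-members. Higher scores are assumed to indicate membership."""
    candidates = np.concatenate([scores_members, scores_nonmembers])
    best_tau, best_acc = None, -1.0
    for tau in candidates:
        # Predict "member" when the (calibrated) score is at least tau.
        tp = np.sum(scores_members >= tau)   # members correctly flagged
        tn = np.sum(scores_nonmembers < tau) # non-members correctly passed
        acc = (tp + tn) / (len(scores_members) + len(scores_nonmembers))
        if acc > best_acc:
            best_tau, best_acc = tau, acc
    return best_tau, best_acc
```

The chosen τ would then be applied unchanged to the private set to decide membership of each example.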
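The cosine learning-rate schedule cited in the Experiment Setup row can be sketched as a minimal function; the base rate 0.1 matches the row, while the decay-to-zero endpoint and the per-epoch granularity are illustrative assumptions rather than details stated in the paper.

```python
import math

def cosine_lr(epoch, total_epochs, base_lr=0.1):
    """Cosine-annealed learning rate: starts at base_lr at epoch 0
    and decays smoothly to 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

For example, with 100 epochs the rate starts at 0.1, passes through 0.05 at the midpoint, and reaches 0 at the end.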