OSLO: One-Shot Label-Only Membership Inference Attacks

Authors: Yuefeng Peng, Jaechul Roh, Subhransu Maji, Amir Houmansadr

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We extensively compared OSLO with state-of-the-art label-only methods [7, 6]. Following recent standards, we evaluated the attacks using metrics such as the true positive rate (TPR) under a low false positive rate (FPR) and performed a precision/recall analysis. Our results show that previous label-only MIAs perform poorly, exhibiting high false positive rates. In contrast, OSLO achieves high precision in identifying members, outperforming prior work by a significant margin. For example, as shown in Figure 2, OSLO is 5× to 67× more powerful than other label-only MIAs in terms of TPR under 0.1% FPR across three datasets using a ResNet18 [13] model.
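For reference, TPR at a fixed low FPR is obtained by thresholding per-sample attack scores so that at most that fraction of non-members is flagged. A minimal Python sketch; the function name and synthetic scores are illustrative, not from the paper:

    import numpy as np

    def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.001):
        # Pick the threshold so that at most target_fpr of non-members
        # score above it, then measure how many members exceed it.
        thresh = np.quantile(nonmember_scores, 1.0 - target_fpr)
        return float(np.mean(member_scores > thresh))

    # Illustrative synthetic scores, not real attack outputs:
    rng = np.random.default_rng(0)
    print(tpr_at_fpr(rng.normal(1.0, 1.0, 1000), rng.normal(0.0, 1.0, 1000)))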
Researcher Affiliation | Academia | Yuefeng Peng, University of Massachusetts Amherst, yuefengpeng@cs.umass.edu; Jaechul Roh, University of Massachusetts Amherst, jroh@umass.edu; Subhransu Maji, University of Massachusetts Amherst, smaji@cs.umass.edu; Amir Houmansadr, University of Massachusetts Amherst, amir@cs.umass.edu
Pseudocode | Yes | Algorithm 1: Transferable adversarial example generation.
Input: Benign input x with label y; source models g; validation model h; number of sub-procedures K; number of iterations N; step size α; maximum perturbation size ε; threshold τ.
Output: Transfer-based unrestricted adversarial example x with approximately minimum change.
x_0 ← x;
for k = 1, 2, ..., K do ...
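The loop body of Algorithm 1 is truncated in the excerpt above. Below is a minimal PyTorch sketch of the pattern the Input/Output lines describe, assuming a PGD-style inner loop against the source-model ensemble and a perturbation budget that grows over the K sub-procedures; the schedule and the role of τ are assumptions, not the paper's exact procedure. In label-only MIAs of this kind, the size of the perturbation needed to flip the label typically serves as the membership signal.

    import torch
    import torch.nn.functional as F

    def transfer_adv_example(x, y, source_models, val_model,
                             K=10, N=50, alpha=0.01, eps_max=0.3, tau=None):
        # Sketch only: everything beyond the quoted Input/Output is assumed.
        # tau (the paper's threshold) is unused because its role is truncated.
        x_adv = x.clone()
        for k in range(1, K + 1):
            eps_k = eps_max * k / K                  # growing budget (assumed)
            x_k = x.clone().requires_grad_(True)
            for _ in range(N):
                # Untargeted loss over the ensemble of source models g
                loss = sum(F.cross_entropy(g(x_k), y) for g in source_models)
                (grad,) = torch.autograd.grad(loss, x_k)
                with torch.no_grad():
                    step = x_k + alpha * grad.sign()
                    x_k = (x + (step - x).clamp(-eps_k, eps_k)).clamp(0, 1)
                x_k.requires_grad_(True)
            with torch.no_grad():
                if (val_model(x_k).argmax(dim=1) != y).all():
                    return x_k.detach()              # smallest budget that transfers
            x_adv = x_k.detach()
        return x_adv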
Open Source Code | No | We will release the code and provide instructions to reproduce our main results.
Open Datasets | Yes | Datasets. We utilized three datasets commonly used in prior works [5]: CIFAR-10 [38], CIFAR-100 [38], and SVHN [39].
Dataset Splits | Yes | For CIFAR-10 and CIFAR-100, we selected 25,000 samples to train the target model under attack. For SVHN, we randomly selected 2,000 samples to train the target model. ... For the evaluation of the attacks, we randomly selected 1,000 samples from both the target model's training set and the remaining unused samples as target samples. The dataset split is summarized in Table 4.
Table 4: Summary of dataset splits for training and evaluation.
Dataset | Target model training samples | Source/validation model training samples | Target samples
CIFAR-10 | 25,000 | 25,000 | 1,000 (members) + 1,000 (non-members)
CIFAR-100 | 25,000 | 25,000 | 1,000 (members) + 1,000 (non-members)
SVHN | 5,000 | 5,000 | 1,000 (members) + 1,000 (non-members)
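A PyTorch sketch of the CIFAR-10 row of Table 4; the seed, index layout, and the choice to draw non-members from the source/validation half (the excerpt only says "remaining unused samples") are assumptions:

    import torch
    from torchvision import datasets, transforms
    from torch.utils.data import Subset

    data = datasets.CIFAR10(root="./data", train=True, download=True,
                            transform=transforms.ToTensor())
    g = torch.Generator().manual_seed(0)
    perm = torch.randperm(len(data), generator=g).tolist()  # 50,000 indices

    target_train = Subset(data, perm[:25_000])       # trains the target model
    source_train = Subset(data, perm[25_000:])       # trains source/validation models
    members = Subset(data, perm[:1_000])             # from the target training set
    non_members = Subset(data, perm[25_000:26_000])  # from samples the target never saw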
Hardware Specification | Yes | All our models and methods are implemented in PyTorch. Our experiments are conducted on an NVIDIA GeForce RTX 2080 Ti with 20 GB of memory.
Software Dependencies | No | All our models and methods are implemented in PyTorch.
Experiment Setup | Yes | In our default setup, all models are trained for 100 epochs using the Adam optimizer, with a learning rate of 0.001. We apply an L2 weight decay coefficient of 10^-6 and use a batch size of 128.
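A minimal sketch of this default setup; the linear model and random tensors are placeholders for the paper's ResNet18/CIFAR pipeline:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for ResNet18
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-6)
    data = TensorDataset(torch.randn(1024, 3 * 32 * 32),
                         torch.randint(0, 10, (1024,)))
    loader = DataLoader(data, batch_size=128, shuffle=True)

    for epoch in range(100):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(xb), yb)
            loss.backward()
            optimizer.step()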