Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Authors: Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, Nicholas Carlini
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on various datasets and models, including both vision-language models (CLIP) and large language models, demonstrating the broad applicability and effectiveness of such an attack. Additionally, we carry out multiple ablation studies with different fine-tuning methods and inference strategies to thoroughly analyze this new threat. |
| Researcher Affiliation | Collaboration | 1 University of Maryland, College Park; 2 Oregon State University; 3 ELLIS Institute, MPI for Intelligent Systems; 4 Google DeepMind |
| Pseudocode | No | The paper describes the attack mechanism and experimental procedures in narrative text and mathematical equations, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | All the models and datasets used in this paper are open-sourced, and we include our code in the supplemental material. |
| Open Datasets | Yes | We present our experimental results, averaged over 5 random seeds, on datasets including ImageNet (Deng et al., 2009), CIFAR-10 (Krizhevsky and Hinton, 2009), and CIFAR-100 (Krizhevsky and Hinton, 2009). Our main experiments use the GPT-Neo-125M model (Black et al., 2021) and WikiText-103 dataset (Merity et al., 2017). We inject 1,000 randomly selected canaries from ai4Privacy (2023)... We employ MIMIC-IV (Johnson et al., 2023) for fine-tuning. (A hedged loading sketch for this model/dataset pair follows the table.) |
| Dataset Splits | No | During fine-tuning, following the hyper-parameters from Wortsman et al. (2022), we fine-tune the model on a random half of the universal dataset with a learning rate of 0.00003 over 5 epochs. During the poisoning phase, the validation set serves as D_aux. |
| Hardware Specification | Yes | Most of our computing resources are allocated to fine-tuning models, utilizing up to four RTX A4000 GPUs at the same time. |
| Software Dependencies | No | The paper mentions optimizers like AdamW and various models (e.g., CLIP, GPT-Neo), but it does not specify versions for software libraries, frameworks, or programming languages (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | For the poisoning phase, we set α = 0.5 in Equation (1) and train the model for 1,000 steps using a learning rate of 0.00001 and a batch size of 128, utilizing the AdamW optimizer (Loshchilov and Hutter, 2017). During fine-tuning, following the hyper-parameters from Wortsman et al. (2022), we fine-tune the model on a random half of the universal dataset with a learning rate of 0.00003 over 5 epochs. (A hedged configuration sketch collecting these hyper-parameters follows the table.) |
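The model and dataset named in the Open Datasets row are both publicly available. The sketch below shows one way to load them, assuming the Hugging Face `transformers` and `datasets` libraries and their standard hub identifiers; the paper's own loading code may differ.

```python
# Minimal sketch (assumed tooling, not the authors' code): load GPT-Neo-125M
# and WikiText-103 from the Hugging Face hub.
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

MODEL_ID = "EleutherAI/gpt-neo-125m"  # hub identifier assumed

model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# "wikitext-103-raw-v1" is the standard raw configuration; the paper may use a
# different preprocessing of WikiText-103 (Merity et al., 2017).
wikitext = load_dataset("wikitext", "wikitext-103-raw-v1")
```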
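The hyper-parameters quoted in the Dataset Splits and Experiment Setup rows can be collected in one place. The sketch below is a hedged reconstruction in PyTorch; the configuration class names, the `random_half_split` helper, and the seed are illustrative assumptions, not identifiers from the authors' released code.

```python
# Hedged reconstruction of the quoted hyper-parameters; names are illustrative.
from dataclasses import dataclass
import torch

@dataclass
class PoisonPhaseConfig:
    alpha: float = 0.5          # weighting term α in the paper's Equation (1)
    steps: int = 1_000          # poisoning optimization steps
    learning_rate: float = 1e-5
    batch_size: int = 128       # optimizer: AdamW (Loshchilov and Hutter, 2017)

@dataclass
class FineTuneConfig:
    learning_rate: float = 3e-5  # following Wortsman et al. (2022)
    epochs: int = 5
    train_fraction: float = 0.5  # fine-tune on a random half of the universal dataset

def random_half_split(dataset, seed: int = 0):
    """Illustrative helper: split a dataset into a random half used for
    fine-tuning and the remaining half (the seed is an assumption)."""
    generator = torch.Generator().manual_seed(seed)
    half = len(dataset) // 2
    return torch.utils.data.random_split(
        dataset, [half, len(dataset) - half], generator=generator
    )

# The poisoning phase would then run for PoisonPhaseConfig.steps steps with
# torch.optim.AdamW at PoisonPhaseConfig.learning_rate, while the validation
# set serves as the auxiliary set D_aux, as quoted in the Dataset Splits row.
```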