Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluation conducted on three datasets and four exemplary LLMs shows that SPV-MIA raises the AUC of MIAs from 0.7 to a significantly high level of 0.9. Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA. |
| Researcher Affiliation | Academia | Wenjie Fu (Huazhong University of Science and Technology, wjfu99@outlook.com); Huandong Wang (Tsinghua University, wanghuandong@tsinghua.edu.cn); Chen Gao (Tsinghua University, chgao96@gmail.com); Guanghua Liu (Huazhong University of Science and Technology, guanghualiu@hust.edu.cn); Yong Li (Tsinghua University, liyong07@tsinghua.edu.cn); Tao Jiang (Huazhong University of Science and Technology, taojiang@hust.edu.cn) |
| Pseudocode | Yes | We provide the detailed pseudo-code of both paraphrasing models in Appendix A.3. (See Appendix A.3, Algorithm 1 and Algorithm 2) |
| Open Source Code | Yes | Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA. |
| Open Datasets | Yes | Our experiments are conducted on four open-source LLMs: GPT-2 [54], GPT-J [69], Falcon-7B [3] and LLaMA-7B [65], all of which are fine-tuned over three datasets across multiple domains and LLM use cases: Wikitext-103 [43], AG News [78] and XSum [49]. (See the dataset- and model-loading sketch after this table.) |
| Dataset Splits | No | The paper mentions a "validation set" in the context of early stopping and perplexity (PPL), but it does not provide specific details on the dataset split for validation, such as percentages or sample counts distinct from training and test sets. |
| Hardware Specification | Yes | All experiments are compiled and tested on a Linux server (CPU: AMD EPYC-7763, GPU: NVIDIA GeForce RTX 3090) |
| Software Dependencies | No | The paper mentions using techniques and models such as LoRA [24], the AdamW optimizer [40], and T5-base, but it does not specify version numbers for any software dependencies, libraries, or frameworks needed for replication. |
| Experiment Setup | Yes | Each target LLM is fine-tuned with a batch size of 16 and trained for 10 epochs. Each self-prompt reference model is trained for 4 epochs. We adopt LoRA [24] as the default Parameter-Efficient Fine-Tuning (PEFT) technique. The learning rate is set to 0.0001. We adopt the AdamW optimizer [40] and early stopping [71]. (A fine-tuning configuration sketch based on these settings follows the table.) |
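
For the assets listed in the Open Datasets row, the sketch below shows one way to pull the four target LLMs and the three fine-tuning corpora from the Hugging Face Hub. The hub identifiers and the choice of the `datasets`/`transformers` libraries are assumptions made for illustration; the paper's repository may pin different checkpoints, revisions, or preprocessing.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub identifiers are assumptions for illustration; the official repo may differ.
TARGET_MODELS = {
    "GPT-2": "gpt2",
    "GPT-J": "EleutherAI/gpt-j-6b",
    "Falcon-7B": "tiiuae/falcon-7b",
    "LLaMA-7B": "huggyllama/llama-7b",  # unofficial mirror; original weights are gated
}

FINETUNE_CORPORA = {
    "Wikitext-103": ("wikitext", "wikitext-103-raw-v1"),
    "AG News": ("ag_news", None),
    "XSum": ("xsum", None),
}


def load_target(model_key: str):
    """Load one of the four target LLMs and its tokenizer."""
    name = TARGET_MODELS[model_key]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    return model, tokenizer


def load_corpus(corpus_key: str):
    """Load one of the three fine-tuning corpora."""
    path, config = FINETUNE_CORPORA[corpus_key]
    return load_dataset(path, config) if config else load_dataset(path)


model, tokenizer = load_target("GPT-2")
corpus = load_corpus("Wikitext-103")
```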
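The Experiment Setup row fixes the reported hyperparameters: batch size 16, 10 epochs for the target model (4 for each self-prompt reference model), LoRA as the PEFT method, a learning rate of 1e-4, AdamW, and early stopping. Below is a minimal `peft`/`Trainer` sketch matching those settings and continuing from the loading sketch above; the LoRA rank, alpha, dropout, sequence length, early-stopping patience, and output path are not stated in the paper and are illustrative only.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

# GPT-2's tokenizer has no pad token; reuse EOS so the collator can pad batches.
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter (the reported PEFT method); rank/alpha/dropout are illustrative, not from the paper.
model = get_peft_model(
    model,  # target LLM from the loading sketch above
    LoraConfig(task_type=TaskType.CAUSAL_LM, r=16, lora_alpha=32, lora_dropout=0.05),
)

# Drop empty Wikitext rows, then tokenize.
corpus = corpus.filter(lambda ex: len(ex["text"].strip()) > 0)
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=corpus["train"].column_names,
)

# Reported settings: batch size 16, 10 epochs, lr 1e-4, AdamW, early stopping.
training_args = TrainingArguments(
    output_dir="target-llm-lora",      # assumed path
    per_device_train_batch_size=16,
    num_train_epochs=10,               # 4 for each self-prompt reference model
    learning_rate=1e-4,
    optim="adamw_torch",
    evaluation_strategy="epoch",       # renamed `eval_strategy` in newer transformers
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience assumed
)
trainer.train()
```

Early stopping here monitors validation loss, which corresponds to the validation-set perplexity the paper mentions (perplexity is a monotone transform of the loss).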