Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration

Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive evaluation conducted on three datasets and four exemplary LLMs shows that SPV-MIA raises the AUC of MIAs from 0.7 to a significantly high level of 0.9. Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA.
Researcher Affiliation | Academia | Wenjie Fu (Huazhong University of Science and Technology, wjfu99@outlook.com); Huandong Wang (Tsinghua University, wanghuandong@tsinghua.edu.cn); Chen Gao (Tsinghua University, chgao96@gmail.com); Guanghua Liu (Huazhong University of Science and Technology, guanghualiu@hust.edu.cn); Yong Li (Tsinghua University, liyong07@tsinghua.edu.cn); Tao Jiang (Huazhong University of Science and Technology, taojiang@hust.edu.cn)
Pseudocode | Yes | We provide the detailed pseudocode of both paraphrasing models in Appendix A.3 (see Algorithm 1 and Algorithm 2). A hedged mask-and-fill paraphrasing sketch follows the table.
Open Source Code | Yes | Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA.
Open Datasets | Yes | Our experiments are conducted on four open-source LLMs: GPT-2 [54], GPT-J [69], Falcon-7B [3] and LLaMA-7B [65], each fine-tuned over three datasets spanning multiple domains and LLM use cases: Wikitext-103 [43], AG News [78] and XSum [49]. A hedged dataset-loading sketch follows the table.
Dataset Splits | No | The paper mentions a validation set in the context of early stopping and perplexity (PPL), but it does not report how the data are partitioned, e.g. percentages or sample counts for the training, validation, and test sets. A hypothetical split sketch follows the table.
Hardware Specification | Yes | All experiments are compiled and tested on a Linux server (CPU: AMD EPYC-7763, GPU: NVIDIA GeForce RTX 3090).
Software Dependencies | No | The paper mentions techniques and models such as LoRA [24], the AdamW optimizer [40], and T5-base, but it does not give version numbers for any of the software dependencies, libraries, or frameworks needed for replication. A version-logging sketch follows the table.
Experiment Setup | Yes | Each target LLM is fine-tuned with a batch size of 16 and trained for 10 epochs. Each self-prompt reference model is trained for 4 epochs. We adopt LoRA [24] as the default Parameter-Efficient Fine-Tuning (PEFT) technique. The learning rate is set to 0.0001. We adopt the AdamW optimizer [40] and early stopping [71]. A hedged fine-tuning configuration sketch follows the table.
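
The paraphrasing pseudocode itself lives in Appendix A.3 of the paper (Algorithms 1 and 2) and is not reproduced here. As a rough illustration only, the sketch below shows the kind of mask-and-fill paraphrasing a T5-base model can perform (T5-base is the paraphrasing model named in the Software Dependencies row); the function name `mask_and_fill`, the single-span masking, and the sampling parameters are assumptions, not the authors' algorithm.

```python
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

@torch.no_grad()
def mask_and_fill(text: str, span: str) -> str:
    """Replace one span with T5's sentinel token and let T5 propose a substitute."""
    masked = text.replace(span, "<extra_id_0>", 1)
    inputs = tokenizer(masked, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=16, do_sample=True, top_p=0.9)
    decoded = tokenizer.decode(out[0], skip_special_tokens=False)
    # T5 emits "<pad><extra_id_0> fill <extra_id_1> ..."; keep only the first fill.
    fill = decoded.split("<extra_id_0>")[-1].split("<extra_id_1>")[0]
    return masked.replace("<extra_id_0>", fill.replace("</s>", "").strip())

print(mask_and_fill("The quick brown fox jumps over the lazy dog.", "quick brown"))
```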
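
The three fine-tuning corpora named in the Open Datasets row are publicly hosted on the Hugging Face Hub. A minimal loading sketch is below; the hub identifiers (`wikitext`/`wikitext-103-raw-v1`, `ag_news`, `xsum`) are assumptions about standard hosting, since the paper does not state which loaders it used.

```python
from datasets import load_dataset

# Hub identifiers are assumptions; the paper does not name its data-loading code.
corpora = {
    "wikitext-103": load_dataset("wikitext", "wikitext-103-raw-v1"),
    "ag_news": load_dataset("ag_news"),
    "xsum": load_dataset("xsum"),
}

for name, ds in corpora.items():
    print(name, {split: len(ds[split]) for split in ds})
```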
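
Because the Dataset Splits row notes that no validation percentages or sample counts are reported, a replication has to choose its own held-out fraction. The sketch below is purely hypothetical: a 90/10 split with a fixed seed via `datasets.Dataset.train_test_split`.

```python
from datasets import load_dataset

# Hypothetical split: the 10% validation fraction and the seed are illustrative only;
# the paper does not report how its validation set (used for early stopping / PPL)
# was carved out.
raw = load_dataset("ag_news", split="train")
split = raw.train_test_split(test_size=0.1, seed=42)
train_set, valid_set = split["train"], split["test"]
print(f"train: {len(train_set)}  validation: {len(valid_set)}")
```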
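
Since the Software Dependencies row records that no library versions are pinned, a replication can at least log the versions it actually runs with. The package list below is an assumption about the stack implied by the paper (PyTorch, Transformers, PEFT, Datasets), not something the paper specifies.

```python
import importlib.metadata as md
import platform

# Assumed package list; adjust to whatever the replication environment actually uses.
for pkg in ["torch", "transformers", "peft", "datasets", "accelerate"]:
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
print("python", platform.python_version())
```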
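
The Experiment Setup row lists the hyperparameters the paper reports (LoRA, batch size 16, 10 epochs for the target model and 4 for the self-prompt reference model, learning rate 0.0001, AdamW, early stopping). The sketch below wires those reported values into a `peft` + `transformers` configuration; the LoRA rank, alpha, dropout, and the choice of GPT-2 as the base model are placeholders, not reported values.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# GPT-2 is just the smallest of the four target LLMs; any of the others could be swapped in.
base = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA rank/alpha/dropout are NOT reported in the paper; these values are placeholders.
model = get_peft_model(base, LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                                        task_type="CAUSAL_LM"))

training_args = TrainingArguments(
    output_dir="spv-mia-target",
    per_device_train_batch_size=16,  # reported batch size
    num_train_epochs=10,             # reported for the target model (4 for the reference model)
    learning_rate=1e-4,              # reported learning rate
    optim="adamw_torch",             # AdamW, as reported
)
# Early stopping (also reported) would additionally need a validation split and
# transformers.EarlyStoppingCallback attached to the Trainer.
```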