Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
Authors: Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluation conducted on three datasets and four exemplary LLMs shows that SPV-MIA raises the AUC of MIAs from 0.7 to a significantly high level of 0.9. Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA. |
| Researcher Affiliation | Academia | Wenjie Fu (Huazhong University of Science and Technology), Huandong Wang (Tsinghua University), Chen Gao (Tsinghua University), Guanghua Liu (Huazhong University of Science and Technology), Yong Li (Tsinghua University), Tao Jiang (Huazhong University of Science and Technology) |
| Pseudocode | Yes | We provide the detailed pseudocode of both paraphrasing models in Appendix A.3. (See Appendix A.3, Algorithm 1 and Algorithm 2) |
| Open Source Code | Yes | Our code and dataset are available at: https://github.com/tsinghua-fib-lab/NeurIPS2024_SPV-MIA. |
| Open Datasets | Yes | Our experiments are conducted on four open-source LLMs: GPT-2 [54], GPT-J [69], Falcon-7B [3] and LLaMA-7B [65], each fine-tuned over three datasets across multiple domains and LLM use cases: Wikitext-103 [43], AG News [78] and XSum [49]. |
| Dataset Splits | No | The paper mentions a "validation set" in the context of early stopping and perplexity (PPL), but it does not provide specific details on the dataset split for validation, such as percentages or sample counts distinct from training and test sets. |
| Hardware Specification | Yes | All experiments are compiled and tested on a Linux server (CPU: AMD EPYC-7763, GPU: NVIDIA GeForce RTX 3090) |
| Software Dependencies | No | The paper mentions using techniques and models like LoRA [24], the AdamW optimizer [40], and T5-base, but it does not specify explicit version numbers for any software dependencies, libraries, or frameworks used for replication. |
| Experiment Setup | Yes | Each target LLM is fine-tuned with a batch size of 16 and trained for 10 epochs. Each self-prompt reference model is trained for 4 epochs. We adopt LoRA [24] as the default Parameter-Efficient Fine-Tuning (PEFT) technique. The learning rate is set to 0.0001. We adopt the AdamW optimizer [40] and early stopping [71]. |
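The core idea behind the AUC gains reported above is calibration: SPV-MIA scores a candidate sample by comparing the fine-tuned target model's likelihood against that of a self-prompted reference model. The sketch below is a hedged illustration of that scoring scheme, not the paper's released code; the function names, the pairwise AUC estimator, and the toy log-probabilities are all illustrative assumptions.

```python
# Illustrative sketch of calibrated membership-inference scoring.
# `calibrated_score` and the toy numbers are hypothetical; the released
# implementation lives at the paper's GitHub repository.

def calibrated_score(target_logprob: float, reference_logprob: float) -> float:
    """Membership signal: how much more likely the fine-tuned target model
    finds a sample than the (self-prompted) reference model does."""
    return target_logprob - reference_logprob


def auc(member_scores, nonmember_scores):
    """AUC as the probability that a random member sample outscores a
    random non-member sample (pairwise comparison estimator)."""
    wins = sum(
        1.0 if m > n else 0.5 if m == n else 0.0
        for m in member_scores
        for n in nonmember_scores
    )
    return wins / (len(member_scores) * len(nonmember_scores))


# Toy data: after calibration, training-set members tend to score higher.
members = [calibrated_score(t, r)
           for t, r in [(-2.0, -3.1), (-1.5, -2.9), (-2.2, -2.8)]]
nonmembers = [calibrated_score(t, r)
              for t, r in [(-3.0, -3.0), (-2.7, -2.5), (-3.2, -2.9)]]
print(round(auc(members, nonmembers), 2))  # → 1.0 on this toy data
```

Subtracting the reference model's log-probability removes the component of the score that merely reflects how "easy" a sample is for any LLM, which is what lifts the attack's AUC relative to uncalibrated loss thresholding.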