Recovering Private Text in Federated Learning of Language Models

Authors: Samyak Gupta, Yangsibo Huang, Zexuan Zhong, Tianyu Gao, Kai Li, Danqi Chen

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental (6 experiments) | "Model and datasets. We evaluate the proposed attack with the GPT-2 base (117M parameters) model (Radford et al., 2019) on two language modeling datasets, including WikiText-103 (Merity et al., 2017) and the Enron Email dataset (Klimt & Yang, 2004). Both datasets are publicly available for research uses." and "Evaluation metrics. We use the following metrics to evaluate the attack performance: (a) ROUGE (Lin, 2004)... (b) We also propose to use named entity recovery ratio (NERR)..." (illustrative metric sketch below the table)
Researcher Affiliation | Academia | Samyak Gupta (Princeton University, samyakg@cs.princeton.edu), Yangsibo Huang (Princeton University, yangsibo@princeton.edu), Zexuan Zhong (Princeton University, zzhong@cs.princeton.edu), Tianyu Gao (Princeton University, tianyug@cs.princeton.edu), Kai Li (Princeton University, li@cs.princeton.edu), Danqi Chen (Princeton University, danqic@cs.princeton.edu)
Pseudocode | Yes | "We provide a detailed algorithm in Appendix A." (a sketch of one common building block of such attacks appears below the table)
Open Source Code | Yes | "Our code is publicly available at https://github.com/Princeton-SysML/FILM."
Open Datasets | Yes | "We evaluate the proposed attack with the GPT-2 base (117M parameters) model (Radford et al., 2019) on two language modeling datasets, including WikiText-103 (Merity et al., 2017) and the Enron Email dataset (Klimt & Yang, 2004). Both datasets are publicly available for research uses." (illustrative loading sketch below the table)
Dataset Splits | No | The paper states "All models were trained using early stopping, i.e., models were trained until the loss of the model on the evaluation set increased," which implies an evaluation/validation set, but it does not specify explicit train/validation/test splits (e.g., percentages or sample counts).
Hardware Specification | Yes | "We note that the running time of our algorithm is quite fast, and we can recover a single sentence in under a minute using an Nvidia 2080TI GPU."
Software Dependencies | No | The paper mentions using the GPT-2 model and implies a software implementation, but it does not specify any software dependencies with version numbers (e.g., "PyTorch 1.9", "Python 3.8").
Experiment Setup | Yes | "Unless otherwise noted, we train the model on these sentences for 90,000 iterations using an initial learning rate of 1 × 10^-5, with a linearly decayed learning rate scheduler. All models were trained using early stopping, i.e., models were trained until the loss of the model on the evaluation set increased." and "Our experiments demonstrate high-fidelity recovery of a single sentence is feasible, and recovery of significant portions of sentences for training batches of up to 128 sentences." and "We analyze the attack performance with different batch sizes, the number of training data points, and the number of training epochs." (illustrative training-loop sketch below the table)
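
For context on the "Open Datasets" and "Research Type" rows: the model and corpora quoted there are publicly downloadable. The following is a minimal loading sketch, not the authors' code. It assumes the Hugging Face `transformers` and `datasets` libraries, which the paper itself does not name; the authors' actual pipeline is in the linked FILM repository.

```python
# Minimal sketch (not the authors' code): load GPT-2 base (117M parameters)
# and the WikiText-103 corpus, assuming the Hugging Face `transformers` and
# `datasets` libraries.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
from datasets import load_dataset

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")   # "gpt2" = the 117M base checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")

# WikiText-103 (Merity et al., 2017); the "raw" variant keeps original casing and punctuation.
wikitext = load_dataset("wikitext", "wikitext-103-raw-v1")

# The Enron Email dataset (Klimt & Yang, 2004) is distributed as raw text/CSV;
# the authors' preprocessing is described in the paper and not reproduced here.
```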
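
The evaluation metrics quoted under "Research Type" are ROUGE and the proposed named entity recovery ratio (NERR). The sketch below computes ROUGE with the `rouge_score` package and approximates NERR with spaCy as an assumed entity extractor; the NERR function is only a plausible approximation, and the paper's exact definition should be taken from the paper itself.

```python
# Metric sketch: ROUGE via the `rouge_score` package, plus an *approximation*
# of the paper's named entity recovery ratio (NERR) using spaCy NER.
from rouge_score import rouge_scorer
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed English NER model (must be installed separately)

def rouge(reference: str, recovered: str) -> dict:
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    return scorer.score(reference, recovered)

def approx_nerr(reference: str, recovered: str) -> float:
    """Fraction of named entities in the reference text that also appear in the
    recovered text -- an approximation, not the paper's exact formula."""
    ref_ents = {ent.text.lower() for ent in nlp(reference).ents}
    if not ref_ents:
        return 1.0
    recovered_lower = recovered.lower()
    return sum(ent in recovered_lower for ent in ref_ents) / len(ref_ents)
```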
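
The "Experiment Setup" row quotes a concrete training recipe: 90,000 iterations, an initial learning rate of 1 × 10^-5 with linear decay, and early stopping once the evaluation loss increases. Below is a minimal PyTorch sketch of that recipe; the optimizer (AdamW), the absence of warmup, and the evaluation interval are assumptions not stated in the paper.

```python
# Sketch of the quoted training configuration. Optimizer choice, warmup, and
# evaluation frequency are assumptions; only the iteration count, initial
# learning rate, linear decay, and early-stopping rule come from the paper.
import torch
from transformers import get_linear_schedule_with_warmup

def train_with_early_stopping(model, train_batches, eval_loss_fn,
                              total_steps=90_000, lr=1e-5, eval_every=1_000):
    # train_batches yields dicts with "input_ids" and "attention_mask" tensors;
    # eval_loss_fn(model) returns the loss on the evaluation set.
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=0, num_training_steps=total_steps)

    best_eval_loss = float("inf")
    for step, batch in enumerate(train_batches):
        if step >= total_steps:
            break
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()

        # Early stopping: train until the loss on the evaluation set increases.
        if (step + 1) % eval_every == 0:
            eval_loss = eval_loss_fn(model)
            if eval_loss > best_eval_loss:
                break
            best_eval_loss = eval_loss
    return model
```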
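
Finally, regarding the "Pseudocode" row: the detailed attack algorithm is given in Appendix A and the released code, and is not reproduced here. The sketch below shows only one well-known building block that gradient-leakage attacks on language models commonly rely on: in federated learning the server observes the client's gradient of the word-embedding matrix, and its non-zero rows reveal which token ids occurred in the private batch. Whether and how exactly the paper uses this observation should be checked against Appendix A.

```python
# Illustrative building block, NOT the full algorithm from Appendix A:
# non-zero rows of the word-embedding gradient identify which token ids
# appeared in the client's private training batch.
import torch

def recover_token_ids_from_embedding_grad(embedding_grad: torch.Tensor,
                                           eps: float = 1e-9) -> list[int]:
    """embedding_grad: gradient of the (vocab_size x hidden_dim) embedding
    matrix as observed by the server. Returns candidate token ids."""
    row_norms = embedding_grad.norm(dim=1)           # one norm per vocabulary entry
    return torch.nonzero(row_norms > eps).flatten().tolist()
```

Ordering the recovered tokens back into fluent sentences is the substantially harder step, and that is what the paper's full recovery algorithm addresses.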