LAMP: Extracting Text from Gradients with Language Model Priors
Authors: Mislav Balunovic, Dimitar Dimitrov, Nikola Jovanović, Martin Vechev
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that LAMP is significantly more effective than prior work: it reconstructs 5x more bigrams and 23% longer subsequences on average. Moreover, we are the first to recover inputs from batch sizes larger than 1 for textual models. These findings indicate that gradient updates of models operating on textual data leak more information than previously thought. |
| Researcher Affiliation | Academia | Mislav Balunović, Dimitar I. Dimitrov, Nikola Jovanović, Martin Vechev {mislav.balunovic,dimitar.iliev.dimitrov,nikola.jovanovic,martin.vechev}@inf.ethz.ch Department of Computer Science, ETH Zurich |
| Pseudocode | Yes | Algorithm 1: Extracting text with LAMP |
| Open Source Code | Yes | We make our code publicly available at https://github.com/eth-sri/lamp. |
| Open Datasets | Yes | To this end, in our experiments we consider three binary classification datasets of increasing complexity: CoLA [37] and SST-2 [33] from GLUE [36], with typical sequence lengths between 5 and 9 words and 3 and 13 words, respectively, and Rotten Tomatoes [26], with typical sequence lengths between 14 and 27 words. |
| Dataset Splits | No | The paper states that models were fine-tuned and evaluation was performed on "100 random sequences from the respective training sets", but it does not specify the train/validation/test splits for the datasets used in the fine-tuning process. |
| Hardware Specification | No | The paper states in Appendix E: "All experiments were run on a server with 8 GPUs." but does not specify the manufacturer, model, or other detailed specifications of these GPUs or any other hardware components. |
| Software Dependencies | No | The paper mentions software like "PyTorch [27]" and models from "Hugging Face [39]" but does not specify exact version numbers for these software dependencies, which would be required for a reproducible setup. |
| Experiment Setup | Yes | For the BERT_BASE and TinyBERT_6 experiments, we run our attack with it = 30, n_c = 75, and n_d = 200, and stop the optimization early once we reach a total of 2000 continuous optimization steps. For the BERT_LARGE model, whose larger number of parameters makes the optimization harder, we use it = 25 and n_c = 200 instead, resulting in 5000 continuous optimization steps. We run DLG and TAG for 10,000 optimization steps on BERT_LARGE and 2500 on all other models. For the continuous optimization, we use Adam [17] with a learning rate decay factor γ applied every 50 steps for all methods and experiments, except for the BERT_LARGE ones where, following Geiping et al. [8], we use AdamW [21] and a linear learning rate decay schedule applied every step. |
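The quoted setup amounts to a standard PyTorch optimizer configuration: Adam with a step-wise decay factor γ every 50 steps for most models, and AdamW with a linear per-step decay for BERT_LARGE. The sketch below illustrates one way this could be wired up; the names (`dummy_embeds`, `reconstruction_loss`) and the default hyperparameter values (`lr`, `gamma`) are illustrative assumptions and are not taken from the paper or the LAMP codebase.

```python
# Hedged sketch of the continuous-optimization setup described above.
# dummy_embeds, reconstruction_loss, lr, gamma are illustrative placeholders.
import torch


def make_optimizer(dummy_embeds, lr=0.01, gamma=0.9, large_model=False, total_steps=2000):
    """Return (optimizer, scheduler) mirroring the quoted setup."""
    if large_model:
        # BERT_LARGE: AdamW with a linear learning-rate decay applied every step.
        opt = torch.optim.AdamW([dummy_embeds], lr=lr)
        sched = torch.optim.lr_scheduler.LambdaLR(
            opt, lambda step: max(0.0, 1.0 - step / total_steps))
    else:
        # Other models: Adam with a decay factor gamma applied every 50 steps.
        opt = torch.optim.Adam([dummy_embeds], lr=lr)
        sched = torch.optim.lr_scheduler.StepLR(opt, step_size=50, gamma=gamma)
    return opt, sched


def optimize(dummy_embeds, reconstruction_loss, steps=2000, **kwargs):
    """Run the continuous optimization; the reconstruction loss (e.g. a
    gradient-matching objective) is supplied by the caller and is outside
    the scope of this sketch."""
    opt, sched = make_optimizer(dummy_embeds, total_steps=steps, **kwargs)
    for _ in range(steps):
        opt.zero_grad()
        loss = reconstruction_loss(dummy_embeds)
        loss.backward()
        opt.step()
        sched.step()
    return dummy_embeds
```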