DAGER: Exact Gradient Inversion for Large Language Models
Authors: Ivo Petrov, Dimitar I. Dimitrov, Maximilian Baader, Mark Müller, Martin Vechev
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide an efficient GPU implementation of DAGER and show experimentally that it recovers full batches of size up to 128 on large language models (LLMs), beating prior attacks in speed (20x at same batch size), scalability (10x larger batches), and reconstruction quality (ROUGE-1/2 > 0.99). |
| Researcher Affiliation | Collaboration | Ivo Petrov¹, Dimitar I. Dimitrov¹,², Maximilian Baader², Mark Niklas Müller²,³, Martin Vechev¹,² (¹ INSAIT, Sofia University "St. Kliment Ohridski"; ² ETH Zurich; ³ LogicStar.ai) |
| Pseudocode | Yes | Algorithm 1 (Recovering Individual Tokens), Algorithm 2 (DAGER for Decoders), Algorithm 3 (DAGER for Encoders); a hedged sketch of the span check behind Algorithm 1 is given after this table |
| Open Source Code | Yes | We provide an efficient GPU implementation of DAGER that can be publicly accessed at https://github.com/insait-institute/dager-gradient-inversion. |
| Open Datasets | Yes | We evaluate DAGER on both encoder- and decoder-based models including BERT [14], GPT-2 [12], and variations of LLaMa [13, 37]. We consider three sentiment analysis datasets CoLA [15], SST2 [16], and Rotten Tomatoes (RT) [17]... Additionally, we consider the ECHR [18] dataset |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits for its own experimental setup, although it mentions evaluating over "batches" from standard datasets. |
| Hardware Specification | Yes | Tests on the LLaMa-2 (7B) architecture were performed on NVIDIA A100 Tensor Core GPUs with 40 GB of memory, while all others were run on NVIDIA L4 GPUs with 24 GB of memory. |
| Software Dependencies | No | The paper mentions "We implement DAGER in PyTorch [44]" but does not specify the version of PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Hyperparameter details: We use a span check acceptance threshold of τ₁ = 10⁻⁵ in the first layer and τ₂ = 10⁻³ in the second, a rank truncation of b = 20, and for decoder-based models consider at most 10,000,000 proposal sentences per recovered EOS token position. We consider pre-trained models with a randomly initialized classification head drawn from a normal distribution with σ = 10⁻³. To manage numerical instabilities, we tweak the SVD eigenvalue threshold τ_rank^l, decreasing it as the batch size grows and varying it between 10⁻⁷ and 10⁻⁹ (see the configuration sketch after this table). |
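To make the recovery step concrete, below is a minimal sketch of the low-rank span check that Algorithm 1 builds on: the gradient of a linear layer's weight matrix is spanned by the layer's inputs, so each candidate token embedding can be tested for membership in that span via its SVD-projection residual. All names here (`span_check`, `grad_W`, `cand_embeddings`) are illustrative assumptions, not the API of the public DAGER repository.

```python
import torch

def span_check(grad_W, cand_embeddings, tau=1e-5, rank_eps=1e-7):
    """Hedged sketch of a gradient span check (after Algorithm 1).

    grad_W: gradient of a linear layer's weight, shape (d_out, d_in);
            its row space is spanned by the inputs that produced it.
    cand_embeddings: candidate token embeddings, shape (vocab, d_in).
    Returns a boolean mask over tokens whose embedding lies (almost)
    entirely inside the gradient's row span, i.e. tokens likely
    present in the client batch.
    """
    # SVD of the gradient; singular values below rank_eps are treated
    # as numerical noise (cf. the paper's tau_rank threshold).
    _, S, Vh = torch.linalg.svd(grad_W, full_matrices=False)
    V = Vh[S > rank_eps]                      # orthonormal basis of the row span
    emb = cand_embeddings / cand_embeddings.norm(dim=-1, keepdim=True)
    proj = emb @ V.T @ V                      # projection onto the span
    residual = (emb - proj).norm(dim=-1)      # distance to the span
    return residual < tau                     # accept if (numerically) inside
```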
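The reported hyperparameters can also be collected into a single configuration sketch; the key names below are invented for readability and do not correspond to the repository's actual option names.

```python
# Hypothetical configuration mirroring the paper's reported values;
# key names are illustrative, not the repository's actual options.
DAGER_CONFIG = {
    "tau1_span_check": 1e-5,              # span-check acceptance threshold, layer 1
    "tau2_span_check": 1e-3,              # span-check acceptance threshold, layer 2
    "rank_truncation_b": 20,              # rank truncation b
    "max_proposals_per_eos": 10_000_000,  # proposal sentences per EOS position (decoders)
    "classifier_head_sigma": 1e-3,        # std of the random classification-head init
    "tau_rank_range": (1e-9, 1e-7),       # SVD eigenvalue threshold, shrunk as batch size grows
}
```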