DAGER: Exact Gradient Inversion for Large Language Models

Authors: Ivo Petrov, Dimitar I. Dimitrov, Maximilian Baader, Mark Müller, Martin Vechev

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide an efficient GPU implementation of DAGER and show experimentally that it recovers full batches of size up to 128 on large language models (LLMs), beating prior attacks in speed (20x at same batch size), scalability (10x larger batches), and reconstruction quality (ROUGE-1/2 > 0.99). |
| Researcher Affiliation | Collaboration | Ivo Petrov (1), Dimitar I. Dimitrov (1,2), Maximilian Baader (2), Mark Niklas Müller (2,3), Martin Vechev (1,2); (1) INSAIT, Sofia University "St. Kliment Ohridski"; (2) ETH Zurich; (3) LogicStar.ai |
| Pseudocode | Yes | Algorithm 1: Recovering Individual Tokens; Algorithm 2: DAGER for Decoders; Algorithm 3: DAGER for Encoders |
| Open Source Code | Yes | We provide an efficient GPU implementation of DAGER, that can be publicly accessed at https://github.com/insait-institute/dager-gradient-inversion. |
| Open Datasets | Yes | We evaluate DAGER on both encoder- and decoder-based models including BERT [14], GPT-2 [12], and variations of LLaMa [13, 37]. We consider three sentiment analysis datasets CoLA [15], SST-2 [16], and Rotten Tomatoes (RT) [17]... Additionally, we consider the ECHR [18] dataset |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits for its own experimental setup, although it mentions evaluating over "batches" from standard datasets. |
| Hardware Specification | Yes | Tests on the LLaMa-2 (7B) architecture were performed on NVIDIA A100 Tensor Core GPUs, which boast 40 GB of memory, while all others were run on NVIDIA L4 GPUs with 24 GB of memory. |
| Software Dependencies | No | The paper mentions "We implement DAGER in PyTorch [44]" but does not specify the version number of PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Hyperparameter details: We use a span check acceptance threshold of τ1 = 10^-5 in the first layer and τ2 = 10^-3 in the second, a rank truncation of b = 20, and for decoder-based models consider at most 10,000,000 proposal sentences per recovered EOS token position. We consider pre-trained models with a randomly initialized classification head using a normal distribution with σ = 10^-3. To manage numerical instabilities within the framework, we tweak the eigenvalue threshold τ_rank^l used in the SVD, decreasing it as the batch size grows, varying it between 10^-7 and 10^-9. |
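
The hyperparameters quoted under Experiment Setup (acceptance thresholds τ1/τ2, rank truncation b, and the SVD eigenvalue cut-off) can be pictured with a minimal PyTorch sketch of a low-rank span check, the kind of test DAGER applies to candidate token embeddings. This is an illustrative sketch under our own assumptions, not the authors' released implementation: the function name `span_check`, the normalisation, and the random inputs in the usage example are ours.

```python
import torch

def span_check(grad_W, z, b=20, sv_threshold=1e-7, tau=1e-5):
    """Test whether a candidate embedding z lies (approximately) in the
    row span of a weight gradient grad_W of shape (d_out, d_in).

    Illustrative sketch only; the defaults mirror the hyperparameters
    quoted above but this is not the authors' released code.
    """
    # SVD of the gradient: the rows of Vh span its row space, which for a
    # linear layer is spanned by the inputs that produced the gradient.
    _, S, Vh = torch.linalg.svd(grad_W, full_matrices=False)

    # Rank truncation: keep at most b directions and drop singular values
    # below the numerical cut-off (the quoted "eigenvalue threshold").
    keep = S > sv_threshold
    keep[b:] = False
    V = Vh[keep]  # (r, d_in), orthonormal rows

    # Project the normalised candidate onto the span and measure the
    # residual; a small residual means z is (numerically) inside the span.
    z = z / z.norm()
    residual = (z - V.T @ (V @ z)).norm()
    return residual < tau

# Usage with random stand-ins for a real layer gradient and token embedding;
# a random candidate is expected to fail the check.
grad_W = torch.randn(768, 768)
candidate = torch.randn(768)
print(span_check(grad_W, candidate))
```

Under the quoted settings, τ would be 10^-5 when checking against the first layer's gradient and 10^-3 at the second, with the singular-value cut-off lowered from 10^-7 towards 10^-9 as the batch size grows.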