LAMP: Extracting Text from Gradients with Language Model Priors
Authors: Mislav Balunovic, Dimitar Dimitrov, Nikola Jovanović, Martin Vechev
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that LAMP is significantly more effective than prior work: it reconstructs 5x more bigrams and 23% longer subsequences on average. Moreover, we are the first to recover inputs from batch sizes larger than 1 for textual models. These findings indicate that gradient updates of models operating on textual data leak more information than previously thought. |
| Researcher Affiliation | Academia | Mislav Balunović, Dimitar I. Dimitrov, Nikola Jovanović, Martin Vechev {mislav.balunovic,dimitar.iliev.dimitrov,nikola.jovanovic,martin.vechev}@inf.ethz.ch Department of Computer Science, ETH Zurich |
| Pseudocode | Yes | Algorithm 1: Extracting text with LAMP |
| Open Source Code | Yes | We make our code publicly available at https://github.com/eth-sri/lamp. |
| Open Datasets | Yes | To this end, in our experiments we consider three binary classification datasets of increasing complexity: CoLA [37] and SST-2 [33] from GLUE [36], with typical sequence lengths between 5 and 9 words and 3 and 13 words, respectively, and Rotten Tomatoes [26], with typical sequence lengths between 14 and 27 words. |
| Dataset Splits | No | The paper states that models were fine-tuned and evaluation was performed on "100 random sequences from the respective training sets", but it does not specify the train/validation/test splits for the datasets used in the fine-tuning process. |
| Hardware Specification | No | The paper states in Appendix E: "All experiments were run on a server with 8 GPUs." but does not specify the manufacturer, model, or other detailed specifications of these GPUs or any other hardware components. |
| Software Dependencies | No | The paper mentions software like "PyTorch [27]" and models from "Hugging Face [39]" but does not specify exact version numbers for these software dependencies, which would be required for a reproducible setup. |
| Experiment Setup | Yes | For the BERT_BASE and TinyBERT_6 experiments, we run our attack with it = 30, n_c = 75, and n_d = 200, and stop the optimization early once we reach a total of 2000 continuous optimization steps. For the BERT_LARGE model, whose larger number of parameters makes the optimization harder, we use it = 25 and n_c = 200 instead, resulting in 5000 continuous optimization steps. We run DLG and TAG for 10,000 optimization steps on BERT_LARGE and 2500 on all other models. For the continuous optimization, we use Adam [17] with a learning rate decay factor γ applied every 50 steps for all methods and experiments, except for the BERT_LARGE ones where, following Geiping et al. [8], we use AdamW [21] and a linear learning rate decay schedule applied every step. |
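The quoted setup amounts to a standard PyTorch optimizer configuration: Adam with a step-wise decay factor γ every 50 steps for most models, and AdamW with a linear per-step decay for BERT_LARGE. The sketch below illustrates one way this could be wired up; the names (`dummy_embeds`, `reconstruction_loss`) and the default hyperparameter values (`lr`, `gamma`) are illustrative assumptions and are not taken from the paper or the LAMP codebase.

```python
# Hedged sketch of the continuous-optimization setup described above.
# dummy_embeds, reconstruction_loss, lr, gamma are illustrative placeholders.
import torch


def make_optimizer(dummy_embeds, lr=0.01, gamma=0.9, large_model=False, total_steps=2000):
    """Return (optimizer, scheduler) mirroring the quoted setup."""
    if large_model:
        # BERT_LARGE: AdamW with a linear learning-rate decay applied every step.
        opt = torch.optim.AdamW([dummy_embeds], lr=lr)
        sched = torch.optim.lr_scheduler.LambdaLR(
            opt, lambda step: max(0.0, 1.0 - step / total_steps))
    else:
        # Other models: Adam with a decay factor gamma applied every 50 steps.
        opt = torch.optim.Adam([dummy_embeds], lr=lr)
        sched = torch.optim.lr_scheduler.StepLR(opt, step_size=50, gamma=gamma)
    return opt, sched


def optimize(dummy_embeds, reconstruction_loss, steps=2000, **kwargs):
    """Run the continuous optimization; the reconstruction loss (e.g. a
    gradient-matching objective) is supplied by the caller and is outside
    the scope of this sketch."""
    opt, sched = make_optimizer(dummy_embeds, total_steps=steps, **kwargs)
    for _ in range(steps):
        opt.zero_grad()
        loss = reconstruction_loss(dummy_embeds)
        loss.backward()
        opt.step()
        sched.step()
    return dummy_embeds
```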