Language Model Inversion
Authors: John Xavier Morris, Wenting Zhao, Justin T Chiu, Vitaly Shmatikov, Alexander M Rush
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On Llama-2 7B, our inversion method reconstructs prompts with a BLEU of 59 and token-level F1 of 78 and recovers 27% of prompts exactly. |
| Researcher Affiliation | Academia | John X. Morris, Wenting Zhao, Justin T. Chiu, Vitaly Shmatikov, Alexander M. Rush Department of Computer Science Cornell University |
| Pseudocode | Yes | Algorithm 1: Logit Extraction via Binary Search, for each word i (a hedged code sketch of this procedure is given after the table) |
| Open Source Code | Yes | Code for reproducing all experiments is available at github.com/jxmorris12/vec2text. |
| Open Datasets | Yes | Our dataset of prompts will be provided upon paper publication. Code for reproducing all experiments is available at https://github.com/jxmorris12/vec2text. Our dataset of prompts is available online and automatically downloaded from Hugging Face datasets. (The paper also details the composition of the Instructions-2M dataset from public sources.) |
| Dataset Splits | No | We randomly hold out 1% of the training data for testing. (This specifies a test split and implies a training split, but does not explicitly detail a validation split or provide comprehensive train/test/validation splits.) |
| Hardware Specification | No | Thanks to the Allen Institute for AI for providing the compute required to train the LLAMA inversion models. (This refers to the original LLAMA models, not the hardware used by the authors for their inversion model experiments, and no specific hardware models are mentioned for their own experiments.) |
| Software Dependencies | No | We train in bfloat16 precision. We parameterize the inversion model using the method described in Section 4 and select T5-base (Raffel et al., 2020) as our encoder-decoder backbone, which has 222M parameters. We train models for 100 epochs with Adam optimizer with a learning rate of 2e-4. (No specific version numbers for software libraries or frameworks are provided.) |
| Experiment Setup | Yes | We set the maximum sequence length to 64 for all experiments. We train models for 100 epochs with Adam optimizer with a learning rate of 2e-4. We use a constant learning rate with linear warmup over the first 25,000 training steps. We train in bfloat16 precision. (A configuration sketch based on these values is given after the table.) |
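
The pseudocode quoted in the Pseudocode row (Algorithm 1, Logit Extraction via Binary Search) recovers each word's next-token logit by binary-searching over a per-word logit bias: the smallest bias that makes word i the model's argmax output equals the gap between the top logit and word i's logit. The Python sketch below only illustrates that idea; `query_argmax` is a hypothetical API wrapper standing in for a real language-model call, and the bias range and tolerance are illustrative values, not taken from the paper's released code.

```python
# Hedged sketch of logit extraction via binary search (Algorithm 1 in the paper).
# `query_argmax(prompt, logit_bias)` is an assumed helper that returns the argmax
# next-token id when the given per-token logit biases are applied; it stands in
# for a real LM API call and is not part of the released vec2text code.

def extract_logit(prompt, word_id, query_argmax, lo=0.0, hi=100.0, tol=1e-3):
    """Binary-search the smallest bias that makes `word_id` the argmax token.

    The logit of `word_id`, relative to the model's most likely token, is then
    approximately the negative of that bias.
    """
    # If the word is already the argmax, its relative logit is ~0.
    if query_argmax(prompt, {}) == word_id:
        return 0.0

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if query_argmax(prompt, {word_id: mid}) == word_id:
            hi = mid   # bias `mid` is sufficient: try a smaller one
        else:
            lo = mid   # bias `mid` is insufficient: need a larger one
    return -hi  # logit(word_id) - logit(argmax token) is approximately -hi


def extract_logits(prompt, vocab_size, query_argmax):
    """Recover the full next-token logit vector, up to an additive constant."""
    return [extract_logit(prompt, i, query_argmax) for i in range(vocab_size)]
```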
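
The hyperparameters quoted in the Experiment Setup row map naturally onto a standard Hugging Face `transformers` training configuration. The snippet below is a minimal sketch assuming the `TrainingArguments` API; the `output_dir` value is hypothetical, and the released vec2text training scripts may wire these hyperparameters up differently.

```python
# Hypothetical mapping of the reported hyperparameters onto Hugging Face
# `transformers.TrainingArguments`; the released vec2text code may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="inversion-t5-base",            # assumed output path
    num_train_epochs=100,                      # "100 epochs"
    learning_rate=2e-4,                        # "Adam optimizer with a learning rate of 2e-4"
    lr_scheduler_type="constant_with_warmup",  # "constant learning rate with linear warmup"
    warmup_steps=25_000,                       # "over the first 25,000 training steps"
    bf16=True,                                 # "train in bfloat16 precision"
)

# The maximum sequence length of 64 would be applied at tokenization time, e.g.:
#   tokenizer(text, max_length=64, truncation=True, padding="max_length")
```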