Function Vectors in Large Language Models
Authors: Eric Todd, Millicent Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are robust to changes in context, i.e., they trigger execution of the task on inputs such as zero-shot and natural text settings that do not resemble the ICL contexts from which they are collected. We test FVs across a range of tasks, models, and layers and find strong causal effects across settings in middle layers. We investigate the internal structure of FVs and find that while they often contain information that encodes the output space of the function, this information alone is not sufficient to reconstruct an FV. Finally, we test semantic vector composition in FVs, and find that to some extent they can be summed to create vectors that trigger new complex tasks. Our findings show that compact, causal internal vector representations of function abstractions can be explicitly extracted from LLMs. |
| Researcher Affiliation | Academia | Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, and David Bau, Khoury College of Computer Sciences, Northeastern University |
| Pseudocode | No | The paper contains mathematical formulations and descriptions of procedures, but no clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Open-source code and data available at functions.baulab.info. |
| Open Datasets | Yes | Our antonym and synonym datasets are based on data taken from Nguyen et al. (2017). ... We construct our language translation datasets English-French, English-German, and English-Spanish using data from Conneau et al. (2017)... Our sentiment analysis dataset is derived from the Stanford Sentiment Treebank (SST-2) Socher et al. (2013)... In our experiments, we use a subset of the CoNLL-2003 English named entity recognition (NER) dataset Sang & De Meulder (2003)... |
| Dataset Splits | No | The paper uses in-context learning with 'shots' as examples and evaluates on a 'test set', but does not specify traditional training/validation/test splits for a global dataset used in their methodology, nor explicit validation splits. |
| Hardware Specification | No | The paper mentions models used and states 'We thank the Center for AI Safety (CAIS) for making computing resources available for this research', but does not specify any particular CPU, GPU, or other hardware details. |
| Software Dependencies | No | The paper mentions 'We use huggingface implementations (Wolf et al., 2020) of each model', but does not provide specific version numbers for Huggingface or other software dependencies. |
| Experiment Setup | Yes | We compute a function vector as the sum over the average output of several attention heads, where the average is conditioned on prompts taken from a particular task. We write this as v_t = Σ_{a_{ℓj} ∈ A} ā^t_{ℓj}. ... We use |P_t| = 100 clean (uncorrupted) 10-shot prompts. ... The AIE is computed over a subset of all abstractive tasks (Appendix E), using |P_t| = 25 corrupted 10-shot prompts per task. ... For GPT-J, we use |A| = 10 attention heads. For larger models, we scale the number of attention heads we use approximately proportionally to the number of attention heads in the model. (We use 20 heads for Llama 2 (7B), 50 for Llama 2 (13B) & GPT-NeoX, and 100 for Llama 2 (70B).) ... For all results in Section 3 and in the following appendix sections (unless stated otherwise, e.g. Figure 4), we add the FV to the hidden state at layer ℓ = |L|/3, which we found works well in practice. This corresponds to layer 9 for GPT-J, layer 15 for GPT-NeoX, layer 11 for Llama 2 (7B), layer 14 for Llama 2 (13B), and layer 26 for Llama 2 (70B). |
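The setup described above — averaging attention-head outputs over task prompts, summing the outputs of a selected head set A, and adding the result to a hidden state at layer |L|/3 — can be sketched in NumPy. This is a toy illustration, not the authors' released code: the array shapes, head indices in `top_heads`, and random data are hypothetical stand-ins (the paper selects heads by their average indirect effect from causal mediation analysis, and head outputs come from a real transformer).

```python
import numpy as np

# Hypothetical toy dimensions; real models use many more layers/heads
# and a hidden size in the thousands.
n_layers, n_heads, d_model = 12, 8, 64
n_prompts = 100  # |P_t|: clean 10-shot prompts per task, as in the paper

rng = np.random.default_rng(0)

# Stand-in for each attention head's output at the final token position
# of every prompt: shape (n_prompts, n_layers, n_heads, d_model).
head_outputs = rng.normal(size=(n_prompts, n_layers, n_heads, d_model))

# Task-conditioned mean output of each head, averaged over the prompts.
mean_head_outputs = head_outputs.mean(axis=0)  # (n_layers, n_heads, d_model)

# A: the set of causal heads (paper: chosen by average indirect effect;
# these (layer, head) indices are arbitrary placeholders).
top_heads = [(3, 1), (5, 0), (7, 4)]

# v_t = sum over heads in A of their task-conditioned mean output.
fv = sum(mean_head_outputs[layer, head] for layer, head in top_heads)

# Intervention: add the FV to the residual hidden state at layer |L|/3.
target_layer = n_layers // 3
hidden_states = rng.normal(size=(n_layers, d_model))  # stand-in activations
hidden_states[target_layer] += fv
```

In a real model this last step would be implemented as a forward hook on the chosen transformer layer, adding `fv` to that layer's output during a zero-shot forward pass.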