Linearity of Relation Decoding in Transformer Language Models
Authors: Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now empirically evaluate how well LREs, estimated using the approach from Section 3, can approximate relation decoding in LMs for a variety of different relations. In all of our experiments, we study autoregressive language models. |
| Researcher Affiliation | Academia | ¹Massachusetts Institute of Technology, ²Northeastern University, ³Technion IIT, ⁴Harvard University. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and dataset are available at lre.baulab.info. |
| Open Datasets | Yes | To support our evaluation, we manually curate a dataset of 47 relations spanning four categories: factual associations, commonsense knowledge, implicit biases, and linguistic knowledge. Each relation is associated with a number of example subject-object pairs (sᵢ, oᵢ), as well as a prompt template that leads the language model to predict o when s is filled in (e.g., "[s] plays the"). When evaluating each model, we filter the dataset to examples where the language model correctly predicts the object o given the prompt. Table 1 summarizes the dataset and filtering results. Further details on dataset construction are in Appendix A. The code and dataset are available at lre.baulab.info. |
| Dataset Splits | No | The paper mentions evaluating on "new subjects s" and selecting hyperparameters using "grid-search," which implies an internal data split for validation. However, it does not explicitly state the proportions or counts for train/validation/test splits of the dataset. |
| Hardware Specification | Yes | We ran all experiments on workstations with 80GB NVIDIA A100 GPUs or 48GB A6000 GPUs using Hugging Face Transformers (Wolf et al., 2019) implemented in PyTorch (Paszke et al., 2019). |
| Software Dependencies | No | The paper mentions "Hugging Face Transformers (Wolf et al., 2019) implemented in PyTorch (Paszke et al., 2019)". However, it does not specify version numbers for these software components, which is necessary for reproducibility. |
| Experiment Setup | Yes | We estimate LREs for each relation using the method discussed in Section 3 with n = 8. When calculating W and b for an individual example, we prepend the remaining n - 1 training examples as few-shot examples, so that the LM is more likely to generate the answer o given s under the relation r over other plausible tokens. We fix the scalar term β (from Equation (4)) once per LM. We also have two hyperparameters specific to each relation r: ℓᵣ, the layer after which s is to be extracted, and ρᵣ, the rank of the inverse of W (used to check causality as in Equation (7)). We select these hyperparameters with grid search; see Appendix E for details. For each relation, we report average results over 24 trials with distinct sets of n examples randomly drawn from the dataset. |
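The filtering step quoted in the Open Datasets row (keep only examples where the LM already predicts o from the prompt) can be sketched as below. This is an illustrative reconstruction, not the paper's released code: `predict_next_token`, the stub dictionary, and the toy examples are all hypothetical stand-ins for a call into the actual language model.

```python
def filter_correct(examples, prompt_template, predict_next_token):
    """Keep only (subject, object) pairs for which the LM's next-token
    prediction on the filled-in prompt matches the object.

    predict_next_token is a stand-in for querying the real LM.
    """
    kept = []
    for subject, obj in examples:
        prompt = prompt_template.format(s=subject)
        if predict_next_token(prompt) == obj:
            kept.append((subject, obj))
    return kept


# Toy usage with a dictionary stub in place of a real LM; the stub
# gets the second example wrong, so that pair is filtered out.
examples = [("Miles Davis", "trumpet"), ("Carol Jantsch", "tuba")]
stub = {"Miles Davis plays the": "trumpet",
        "Carol Jantsch plays the": "piano"}
filtered = filter_correct(examples, "{s} plays the", stub.get)
```

With a real model, `predict_next_token` would decode the argmax next token for the prompt; the paper reports the resulting per-relation filtered counts in its Table 1.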
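The Experiment Setup row summarizes the paper's LRE construction: W and b come from a mean first-order (Jacobian) approximation of the LM's subject-to-object computation over n training examples, with β scaling the linear term as in Equation (4). A minimal NumPy sketch under that reading follows; the finite-difference Jacobian and the toy nonlinear map F stand in for autograd through a real transformer and are not the paper's implementation.

```python
import numpy as np


def numerical_jacobian(F, s, eps=1e-5):
    """Central-difference Jacobian of F at s (stand-in for autograd)."""
    d_out = F(s).shape[0]
    J = np.zeros((d_out, s.shape[0]))
    for j in range(s.shape[0]):
        e = np.zeros_like(s)
        e[j] = eps
        J[:, j] = (F(s + e) - F(s - e)) / (2 * eps)
    return J


def estimate_lre(F, subjects, beta=1.0):
    """Mean first-order approximation of F over the training subjects:
    W = E[dF/ds], b = E[F(s) - (dF/ds) s]; LRE(s) = beta * W s + b."""
    Ws, bs = [], []
    for s in subjects:
        J = numerical_jacobian(F, s)
        Ws.append(J)
        bs.append(F(s) - J @ s)
    W = np.mean(Ws, axis=0)
    b = np.mean(bs, axis=0)
    return lambda s: beta * W @ s + b


# Toy "relation computation": a mildly nonlinear map that a single
# affine transformation should approximate well near the data.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
F = lambda s: A @ s + 0.01 * np.tanh(s)
rng = np.random.default_rng(0)
subjects = [rng.normal(size=2) for _ in range(8)]  # n = 8, as in the paper
lre = estimate_lre(F, subjects)
```

In the paper, F maps a subject's hidden representation at layer ℓᵣ to the model's object prediction, and the quality of the fit is exactly what the linearity claim is testing; here the held-out check is just that `lre(s_new)` tracks `F(s_new)` for a new subject vector.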