On the Origins of Linear Representations in Large Language Models
Authors: Yibo Jiang, Goutham Rajendran, Pradeep Kumar Ravikumar, Bryon Aragam, Victor Veitch
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we study the origins of such linear representations. To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of the next token prediction. We use this formalism to show that the next token prediction objective (softmax with cross-entropy) and the implicit bias of gradient descent together promote the linear representation of concepts. Experiments show that linear representations emerge when learning from data matching the latent variable model, confirming that this simple structure already suffices to yield linear representations. We additionally confirm some predictions of the theory using the LLaMA-2 large language model, giving evidence that the simplified model yields generalizable insights. |
| Researcher Affiliation | Academia | 1) Department of Computer Science, University of Chicago; 2) Machine Learning Department, Carnegie Mellon University; 3) Booth School of Business, University of Chicago; 4) Department of Statistics, University of Chicago; 5) Data Science Institute, University of Chicago. |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code for the described methodology or links to a code repository. |
| Open Datasets | Yes | For simulation experiments, we first create simulated datasets by initially creating random DAGs (see Remark 2.1) with m variables/concepts. ... From these random graphical models, we can sample values of variables which are binary vectors as our datasets. ... For the unembedding concept vectors, we use the 27 concepts as described in (Park et al., 2023), which were built on top of the Big Analogy Test dataset (Gladkova et al., 2016). Examples of both datasets and a list of the 27 concepts are shown in Appendix F. ... We consider four language pairs, French-Spanish, French-German, English-French, and German-Spanish, from the OPUS Books dataset (Tiedemann, 2012). (See the illustrative data-generation sketch after the table.) |
| Dataset Splits | No | The paper does not explicitly provide details about dataset splits (train/validation/test percentages or counts). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions optimizers (stochastic gradient descent, Adam) and models (LLaMA-2) but does not give version numbers for software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | Unless otherwise stated, the model is trained using stochastic gradient descent with a learning rate of 0.1 and batch size 100. (See the illustrative training sketch after the table.) |
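
The Open Datasets row describes simulated data obtained by sampling binary concept vectors from random DAGs over m variables. The sketch below is a minimal illustration under assumptions: the edge probability, the logistic conditional distributions, and all function names (`random_dag`, `sample_binary_vectors`) are hypothetical choices, since the excerpt does not specify the paper's exact graphical model or sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_dag(m, edge_prob=0.3):
    """Random DAG over m binary concepts, encoded as a strictly
    upper-triangular adjacency matrix (parents precede children)."""
    adj = rng.random((m, m)) < edge_prob
    return np.triu(adj, k=1)

def sample_binary_vectors(adj, n_samples, weight=1.5):
    """Ancestrally sample binary concept vectors from the DAG using
    simple logistic conditionals (an illustrative choice, not the
    paper's specified sampling scheme)."""
    m = adj.shape[0]
    x = np.zeros((n_samples, m), dtype=int)
    for j in range(m):
        parents = np.where(adj[:, j])[0]
        # Logits depend on parent values mapped to {-1, +1};
        # root nodes get probability 0.5.
        logits = weight * (2 * x[:, parents] - 1).sum(axis=1)
        p = 1.0 / (1.0 + np.exp(-logits))
        x[:, j] = rng.random(n_samples) < p
    return x

adj = random_dag(m=10)
data = sample_binary_vectors(adj, n_samples=1000)  # binary vectors as the dataset
```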
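
The Experiment Setup row, together with the objective named under Research Type (softmax with cross-entropy), can be illustrated with a minimal PyTorch training loop. This is a sketch under assumptions, not the authors' code: the toy embedding/unembedding model, vocabulary size, hidden dimension, and placeholder random data are hypothetical; only the optimizer (stochastic gradient descent), learning rate 0.1, batch size 100, and the softmax cross-entropy objective are taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper does not fix these in the quoted text.
vocab_size, d_model, batch_size = 1024, 64, 100

class ToyLM(nn.Module):
    """Minimal embedding/unembedding model standing in for the paper's
    latent-variable setup; not the authors' exact architecture."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.unembed = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, tokens):
        return self.unembed(self.embed(tokens))  # logits over the next token

model = ToyLM()
loss_fn = nn.CrossEntropyLoss()                     # softmax + cross-entropy
opt = torch.optim.SGD(model.parameters(), lr=0.1)   # lr 0.1 as stated in the paper

for step in range(1000):
    # Placeholder random context/next-token pairs; the paper instead
    # trains on samples from its latent variable model.
    ctx = torch.randint(0, vocab_size, (batch_size,))  # batch size 100
    nxt = torch.randint(0, vocab_size, (batch_size,))
    loss = loss_fn(model(ctx), nxt)
    opt.zero_grad()
    loss.backward()
    opt.step()
```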