How do Language Models Bind Entities in Context?

Authors: Jiahai Feng, Jacob Steinhardt

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using causal interventions, we show that LMs' internal activations represent binding information by attaching binding ID vectors to corresponding entities and attributes. We further show that binding ID vectors form a continuous subspace, in which distances between binding ID vectors reflect their discernability. Overall, our results uncover interpretable strategies in LMs for representing symbolic knowledge in-context, providing a step towards understanding general in-context reasoning in large-scale LMs. (See the intervention sketch after the table.)
Researcher Affiliation | Academia | Jiahai Feng & Jacob Steinhardt, UC Berkeley
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. The methods are described in prose and with mathematical formulas.
Open Source Code | Yes | We release code and datasets here: https://github.com/jiahai-feng/binding-iclr
Open Datasets | No | No training datasets are provided, as the experiments are conducted on pre-trained language models. The paper mentions sampling N=100 contexts for evaluation purposes, but this is not a training split for the models themselves.
Dataset Splits | No | No validation dataset splits are provided, as the experiments are conducted on pre-trained language models. The paper discusses 'median-calibrated accuracy' as an evaluation metric, not a validation split. (See the calibration sketch after the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided. The paper only mentions the sizes of the language models used (e.g., 'LLaMA-30b', 'LLaMA-13b').
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) are mentioned. The paper refers to model families such as LLaMA and Pythia.
Experiment Setup | Yes | In our experiments, we fix n = 2 and use 500 samples to estimate E(1) and A(1). We use LLaMA-30b here and elsewhere unless otherwise stated. (See the estimation sketch after the table.)
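
Estimation sketch. The experiment-setup row quotes the paper's statement that 500 samples are used to estimate E(1) and A(1). The exact estimator is in the released repository; the sketch below is only one plausible reading, assuming E(k) and A(k) denote mean residual-stream activations at the k-th entity and attribute token positions, so that E(1) - E(0) and A(1) - A(0) approximate binding ID difference vectors. The layer choice, the checkpoint name, and the sample_context helper are assumptions, not details taken from the paper.

```python
# Hedged sketch: estimate mean activations at entity/attribute token positions
# over sampled two-entity contexts (n = 2), treating E(k)/A(k) as the mean
# hidden state of the k-th entity / attribute token at one layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "huggyllama/llama-30b"  # assumption: any LLaMA-family checkpoint exposes the same API
LAYER = 20                      # hypothetical residual-stream layer
N_SAMPLES = 500

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16, device_map="auto")
model.eval()

def hidden_at(text, positions, layer=LAYER):
    """Hidden states at the given token positions of `text`."""
    ids = tok(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    h = out.hidden_states[layer][0]  # (seq_len, d_model)
    return torch.stack([h[p] for p in positions])

# sample_context() is a hypothetical helper returning a two-entity context such as
# "Alice lives in Paris. Bob lives in Berlin." plus the token positions of the two
# entity names and of the two attribute (city) tokens.
E_sum = torch.zeros(2, model.config.hidden_size)
A_sum = torch.zeros(2, model.config.hidden_size)
for _ in range(N_SAMPLES):
    text, entity_pos, attr_pos = sample_context()
    E_sum += hidden_at(text, entity_pos).float().cpu()
    A_sum += hidden_at(text, attr_pos).float().cpu()

E = E_sum / N_SAMPLES    # rows: mean entity activation for binding positions 0 and 1
A = A_sum / N_SAMPLES    # rows: mean attribute activation for binding positions 0 and 1
delta_E = E[1] - E[0]    # candidate binding ID difference vector on the entity side
delta_A = A[1] - A[0]    # candidate binding ID difference vector on the attribute side
```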
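
Intervention sketch. The research-type row refers to the paper's causal interventions on internal activations; the paper's own interventions are implemented in the linked repository. The following is a minimal sketch of one such intervention under the assumption that a binding difference vector delta_E (estimated as in the estimation sketch) is added to the first entity's activations and subtracted from the second's at a single layer, so that the model's answer to a query about one entity should flip to the other entity's attribute if binding IDs are carried by these vectors. The hook mechanics are standard PyTorch; the layer, positions, and query phrasing are assumptions.

```python
import torch

def make_patch_hook(positions, vectors):
    """Forward hook that adds vectors[i] to the hidden state at positions[i]."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        for pos, vec in zip(positions, vectors):
            hidden[0, pos] += vec.to(hidden.device, hidden.dtype)
        return output
    return hook

# Shift the two entities' activations in opposite directions along delta_E,
# which should exchange their binding IDs if the additive picture holds.
text, entity_pos, attr_pos = sample_context()   # hypothetical helper from the sketch above
layer_module = model.model.layers[LAYER]        # a LLaMA decoder layer in transformers
handle = layer_module.register_forward_hook(
    make_patch_hook(entity_pos, [delta_E, -delta_E])
)

query = text + " Therefore, Alice lives in"     # hypothetical query phrasing
with torch.no_grad():
    logits = model(**tok(query, return_tensors="pt").to(model.device)).logits
handle.remove()

# Under the intervention, the top next-token prediction for the queried entity
# should now be the other entity's attribute.
print(tok.decode(logits[0, -1].argmax()))
```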
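
Calibration sketch. The dataset-splits row mentions 'median-calibrated accuracy'. The paper defines this metric precisely; the sketch below shows one common way such a calibration can be computed, assuming each candidate answer's log-probability is shifted by its median over the evaluation contexts before taking the argmax, which removes a constant bias toward particular answers. The array shapes and the function name are assumptions.

```python
import numpy as np

def median_calibrated_accuracy(log_probs, labels):
    """
    log_probs: (num_queries, num_candidate_answers) answer log-probabilities.
    labels:    (num_queries,) indices of the correct answers.
    Assumed calibration: subtract each answer's median log-probability across
    queries so a constant preference for one answer cannot inflate accuracy.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    calibrated = log_probs - np.median(log_probs, axis=0, keepdims=True)
    preds = calibrated.argmax(axis=1)
    return float((preds == np.asarray(labels)).mean())

# Toy usage: three queries, two candidate answers.
print(median_calibrated_accuracy(
    [[-1.2, -0.9], [-0.8, -1.5], [-1.0, -1.1]],
    [1, 0, 0],
))
```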