Transformers Represent Belief State Geometry in their Residual Stream
Authors: Adam Shai, Lucas Teixeira, Alexander Gietelink Oldenziel, Sarah Marzen, Paul Riechers
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test this framework, we conduct well-controlled experiments where we train transformers on data generated from processes with hidden ground truth structure, and then use our theory to make predictions about the geometry of internal activations. Even in cases where the framework predicts highly nontrivial fractal structure, our empirical results confirm these predictions (Figure 1). |
| Researcher Affiliation | Collaboration | Adam S. Shai (Simplex, PIBBSS); Sarah E. Marzen (Department of Natural Sciences, Pitzer and Scripps College); Lucas Teixeira (PIBBSS); Alexander Gietelink Oldenziel (University College London, Timaeus); Paul M. Riechers (Simplex, BITS) |
| Pseudocode | No | The paper describes methods and procedures but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | We have included code in the submission that reproduces all results. It should be noted that we will continue to work on cleaning up this code for final submission (though the code very much works as is and recreates the figures in the submission). |
| Open Datasets | No | The paper uses data generated by specific Hidden Markov Models (HMMs), the Mess3 and RRXOR processes, defined within the paper itself. It does not provide access information (link, DOI, specific repository, or citation to an external dataset source) for a publicly available or open dataset. (An illustrative HMM sampling and belief-state sketch follows the table.) |
| Dataset Splits | No | The paper mentions a 'normalized validation loss' (Figure S2), indicating that a validation set was used, and describes a 20%/80% train/test split for the *regression analysis* on activations (see the regression sketch after the table). However, it does not give the percentages or sample counts of the training, validation, and test splits for the *HMM-generated data* the transformer itself was trained on, so the data partitioning for the primary experiment cannot be reproduced. |
| Hardware Specification | No | The paper explicitly states in its NeurIPS checklist: 'These experiments are on small toy models and thus can be run on any modern hardware,' and therefore provides no specific hardware details for reproduction. |
| Software Dependencies | No | The paper mentions using the 'Transformer Lens library [22]' for analysis, but does not provide specific version numbers for this or any other software dependencies. It only mentions 'pip install -e .' for installation without listing explicit versioned dependencies. |
| Experiment Setup | Yes | In our experiments, we trained a transformer model using the following hyperparameters and training parameters: The model had a context window size of 10, used ReLU as the activation function, and had a head dimension of 8 and a model dimension of 64. There was 1 attention head in each of 4 layers. The model had MLPs of dimension 256 and used causal attention masking. Layer normalization was applied. For training, we used the Stochastic Gradient Descent (SGD) optimizer with a batch size of 64, running for 1,000,000 epochs, and a learning rate of 0.01 with no weight decay. (A configuration sketch based on these values follows the table.) |
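The paper's central object is the geometry of Bayesian belief states over the hidden states of the data-generating HMM. The snippet below is a minimal sketch of that pipeline: sampling token sequences from a small HMM and computing the belief-state trajectory by Bayesian filtering. The transition and emission probabilities here are illustrative placeholders, not the paper's Mess3 or RRXOR parameters, which are defined in the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Emission probabilities E[i, x] and state-transition probabilities A[i, j].
# These values are placeholders, NOT the paper's Mess3 or RRXOR parameters.
E = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
A = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])

# Labeled transition matrices T[x][i, j] = P(emit x, move to state j | state i).
T = np.einsum('ix,ij->xij', E, A)
n_symbols, n_states, _ = T.shape

def sample_sequence(length):
    """Sample a token sequence by walking the HMM."""
    state = rng.integers(n_states)
    tokens = np.empty(length, dtype=np.int64)
    for t in range(length):
        probs = T[:, state, :].ravel()               # joint over (symbol, next state)
        idx = rng.choice(n_symbols * n_states, p=probs / probs.sum())
        tokens[t], state = divmod(idx, n_states)
    return tokens

def belief_trajectory(tokens):
    """Bayesian-filter the optimal observer's belief over hidden states."""
    M = T.sum(axis=0)                                # marginal transition matrix
    eigvals, eigvecs = np.linalg.eig(M.T)
    pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    belief = pi / pi.sum()                           # stationary prior
    beliefs = []
    for x in tokens:
        belief = belief @ T[x]                       # Bayes update on observing x
        belief = belief / belief.sum()
        beliefs.append(belief.copy())
    return np.array(beliefs)                         # points on the probability simplex

tokens = sample_sequence(10)
print(belief_trajectory(tokens))
```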
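The 'Experiment Setup' row fully specifies the model dimensions and optimizer. One way to instantiate a model with those dimensions is through the Transformer Lens library, which the paper cites only for its analysis; whether the authors trained with this library is not stated, so treat this as a sketch rather than the authors' training code. The vocabulary size `d_vocab=3` is an assumption (one token per emission symbol of a three-symbol process such as Mess3).

```python
import torch
from transformer_lens import HookedTransformer, HookedTransformerConfig

# Dimensions taken from the 'Experiment Setup' row; d_vocab=3 is an assumption.
cfg = HookedTransformerConfig(
    n_layers=4,
    n_heads=1,
    d_model=64,
    d_head=8,
    d_mlp=256,
    n_ctx=10,
    d_vocab=3,
    act_fn="relu",
    normalization_type="LN",   # layer normalization
    attention_dir="causal",    # causal attention masking
)
model = HookedTransformer(cfg)

# Optimizer settings from the same row: SGD, lr=0.01, no weight decay.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.0)

# One illustrative training step on a batch of 64 length-10 token sequences;
# in practice the batch would come from the HMM sampler sketched above.
batch = torch.randint(0, 3, (64, 10))
loss = model(batch, return_type="loss")   # next-token cross-entropy
loss.backward()
optimizer.step()
optimizer.zero_grad()
```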
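The 'Dataset Splits' row quotes a 20%/80% train/test split for the regression from activations to belief states. As a hedged illustration of that kind of analysis (the paper's exact regression setup may differ), the sketch below fits an ordinary least-squares map from residual-stream activations to belief-state coordinates and evaluates it on held-out points; the stand-in arrays `X` and `Y` and the use of scikit-learn are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-in arrays: in the paper's analysis these would be residual-stream
# activations of shape (n_positions, d_model) and the corresponding ground-truth
# belief states of shape (n_positions, n_hidden_states) from the trained model.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 64))           # placeholder activations, d_model = 64
Y = rng.dirichlet(np.ones(3), size=5000)  # placeholder belief states on the simplex

# 20% train / 80% test, matching the split ratio quoted in the table.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, train_size=0.2, test_size=0.8, random_state=0
)

# Ordinary least-squares map from activations to belief-state coordinates.
reg = LinearRegression().fit(X_train, Y_train)
Y_pred = reg.predict(X_test)

mse = np.mean((Y_pred - Y_test) ** 2)
print(f"held-out MSE: {mse:.6f}, R^2: {reg.score(X_test, Y_test):.4f}")
```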