Hypothesis Testing the Circuit Hypothesis in LLMs

Authors: Claudia Shi, Nicolas Beltran Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, David Blei

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We apply these tests to six circuits described in the research literature. We find that synthetic circuits (circuits that are hard-coded in the model) align with the idealized properties. Circuits discovered in Transformer models satisfy the criteria to varying degrees. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, Columbia University, New York, USA; (2) Computer Science and Engineering, University of Michigan, Ann Arbor, USA; (3) FAR AI, USA |
| Pseudocode | Yes | Algorithm 1: Tail Test |
| Open Source Code | Yes | To facilitate future empirical studies of circuits, we created the circuitry package, a wrapper around the Transformer Lens library, which abstracts away lower-level manipulations of hooks and activations. The software is available at https://github.com/blei-lab/circuitry. |
| Open Datasets | Yes | We use the dataset provided by Wang et al. [2023] following the structure above. [...] We use the dataset provided by Conmy et al. [2023], which contains 40 sequences of 300 tokens from the validation split of OpenWebText [Gokaslan and Cohen, 2019], filtered to include instances of induction. [...] We use the dataset provided by Heimersheim and Janiak [2023] following the structure above. |
| Dataset Splits | Yes | We use the dataset provided by Conmy et al. [2023], which contains 40 sequences of 300 tokens from the validation split of OpenWebText [Gokaslan and Cohen, 2019], filtered to include instances of induction. |
| Hardware Specification | Yes | Our package is implemented efficiently, and can evaluate hundreds of circuits in a few minutes on a single A5000 GPU. |
| Software Dependencies | No | The paper mentions the 'Transformer Lens' library and the authors' own 'circuitry package', but it gives no version numbers for these dependencies, which reproducibility requires. |
| Experiment Setup | Yes | We draw 100 random circuits to form the reference distribution for the sufficiency and partial necessity tests. For minimality, we draw 10,000 random edges for G-T and IOI and 1000 random edges for the other circuits. In all experiments, we use Eq. 1 with ℓ2 norm as the faithfulness metric. We set q to be 0.9 and α to be 0.05. |
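
The experiment setup above compares a candidate circuit against a reference distribution of 100 random circuits, with q = 0.9 and α = 0.05. The sketch below shows one way such a comparison could be run as a one-sided binomial tail test. It is not the paper's Algorithm 1, whose details are not reproduced in this summary. The function names, the null hypothesis chosen here (the candidate beats a random circuit with probability at most q), and the convention that a lower score means a more faithful circuit are all assumptions made for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, q):
    """Return P(X >= k) for X ~ Binomial(n, q)."""
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(k, n + 1))


def tail_test(candidate_score, reference_scores, q=0.9, alpha=0.05):
    """Hypothetical one-sided tail test (illustrative, not the paper's algorithm).

    Assumed null hypothesis: the candidate circuit beats a randomly drawn
    circuit with probability at most q.
    Assumed convention: a lower score means a more faithful circuit.
    """
    n = len(reference_scores)
    # Number of random circuits that the candidate strictly outperforms.
    k = sum(candidate_score < s for s in reference_scores)
    # Chance of seeing at least k wins if the win probability were exactly q.
    p_value = binomial_upper_tail(k, n, q)
    return p_value, p_value <= alpha


if __name__ == "__main__":
    # Made-up scores: the candidate beats 96 of 100 random circuits.
    reference = [1.0] * 96 + [0.0] * 4
    p_value, reject = tail_test(0.1, reference)
    print(f"p-value = {p_value:.4f}, reject null = {reject}")
```

Under these assumptions, with 100 reference circuits and q = 0.9, the candidate has to beat at least 96 of them for the null to be rejected at α = 0.05; beating 95 is not enough.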