Knowledge Circuits in Pretrained Transformers

Authors: Yunzhi Yao, Ningyu Zhang, Zekun Xi, Mengru Wang, Ziwen Xu, Shumin Deng, Huajun Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiments, conducted with GPT2 and TinyLLAMA, have allowed us to observe how certain information heads, relation heads, and multilayer perceptrons collaboratively encode knowledge within the model.
Researcher Affiliation | Collaboration | 1) Zhejiang University; 2) National University of Singapore, NUS-NCS Joint Lab, Singapore; 3) Zhejiang Key Laboratory of Big Data Intelligent Computing
Pseudocode | No | The paper describes its methods in prose and through mathematical equations but does not include any dedicated pseudocode blocks or algorithm listings.
Open Source Code | Yes | Code and data are available at https://github.com/zjunlp/KnowledgeCircuits.
Open Datasets | Yes | We utilize the dataset provided by LRE [42] and consider different kinds of knowledge, including linguistic, commonsense, fact, and bias.
Dataset Splits | Yes | To evaluate completeness, we first construct the circuit using the validation data Dval for a specific knowledge type and then test its performance on the test split Dtest in isolation.
Hardware Specification | Yes | We use the NVIDIA A800 (40GB) to conduct our experiments.
Software Dependencies | No | The paper mentions specific toolkits such as the "Automated Circuit Discovery [32]" toolkit and "transformer lens [41]", and frameworks such as "EasyEdit [74]", but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | The primary hyperparameter for constructing a circuit is the threshold τ used to detect performance drops... In our experiment, we test τ values from the set {0.02, 0.01, 0.005} to determine the appropriate circuit size for different types of knowledge.
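
To make the reported setup concrete, the sketch below shows how the τ sweep over the Dval/Dtest splits could be organized. It is not the authors' released code: only the TransformerLens model loading reflects a real API, while `discover_circuit`, `evaluate_circuit`, `d_val`, and `d_test` are hypothetical stand-ins for the ACDC-style edge pruning and the completeness evaluation described in the rows above.

```python
# Hedged sketch of the tau sweep, assuming an ACDC-style pipeline; not the
# authors' implementation. Real API: transformer_lens.HookedTransformer.
# Hypothetical: discover_circuit / evaluate_circuit and the data placeholders.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # the paper also uses TinyLLAMA

def discover_circuit(model, val_data, tau):
    """Hypothetical stand-in: keep only edges whose ablation shifts the task
    metric by more than tau; returns the kept edge set."""
    return set()  # placeholder so the sketch runs end to end

def evaluate_circuit(model, circuit, test_data):
    """Hypothetical stand-in: run the isolated circuit on held-out data and
    return a task score."""
    return 0.0  # placeholder

d_val, d_test = [], []  # knowledge prompts for one type, e.g. from the LRE data

# Thresholds tested in the paper; a larger tau prunes more edges and
# therefore yields a smaller circuit.
for tau in (0.02, 0.01, 0.005):
    circuit = discover_circuit(model, d_val, tau)     # build the circuit on Dval
    score = evaluate_circuit(model, circuit, d_test)  # test completeness on Dtest
    print(f"tau={tau}: |circuit|={len(circuit)}, test score={score:.3f}")
```

The loop structure mirrors the paper's protocol: circuits are discovered per knowledge type on the validation split, then judged in isolation on the test split, with τ controlling the trade-off between circuit size and retained performance.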