Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification
Authors: Lin Zhang, Wenshuo Dong, Zhuoran Zhang, Shu Yang, Lijie Hu, Ninghao Liu, Pan Zhou, Di Wang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate EAP-GP on six datasets using GPT-2 Small, GPT-2 Medium, and GPT-2 XL. Experimental results demonstrate that EAP-GP outperforms existing methods in circuit faithfulness, achieving improvements up to 17.7%. |
| Researcher Affiliation | Academia | 1 King Abdullah University of Science and Technology (KAUST) 2 Provable Responsible AI and Data Analytics (PRADA) Lab 3 Harbin Institute of Technology, Shenzhen 4 University of Copenhagen 5 Peking University 6 MBZUAI 7 Hong Kong Polytechnic University 8 Huazhong University of Science and Technology |
| Pseudocode | Yes | Algorithm 1 Edge Attribution Patching with Grad Path (EAP-GP) for edge (u, v) |
| Open Source Code | No | All experiments were performed using public datasets and models. Detailed information is provided in the Appendix and the experimental setup section. Additionally, we plan to release the code to further support reproducibility. |
| Open Datasets | Yes | We evaluate model performance using six datasets: Indirect Object Identification (IOI), Subject-Verb Agreement (SVA), Gender-Bias, Capital Country, Hypernymy, and Greater-Than (Hanna et al., 2024b). |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test splits for the input data; it instead evaluates circuits at different 'edge sparsity levels'. For the experimental setup it defers to prior work ('Following the setting of Hanna et al. (2024b)'), so the data splits are not detailed within this paper. |
| Hardware Specification | Yes | All experiments are conducted on an NVIDIA A40 GPU. |
| Software Dependencies | No | The paper mentions models like GPT-2 Small, GPT-2 Medium, and GPT-2 XL, but does not explicitly list any specific software libraries or frameworks with version numbers used for implementation. |
| Experiment Setup | Yes | All experiments are conducted on GPT-2 Small (117M), GPT-2 Medium (345M), and GPT-2 XL (1.5B), which contain 32,491, 231,877, and 2,235,025 edges, respectively. We set k = 5 in EAP-GP. Following (Hanna et al., 2024b), we perform EAP-IG with hyperparameters set to k = 5 steps. [...] we employ a greedy search strategy to iteratively evaluate the top-ranked edges, selecting the top n edges (a hyperparameter) with the highest scores to form the circuit. |
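
The edge-selection step quoted in the Experiment Setup row (keep the top n edges with the highest scores) can be illustrated with a minimal Python sketch. This is not the authors' code, which the table notes has not been released: the function name `select_top_n_edges`, the `(source, target)` edge representation, and the toy node names and scores are all hypothetical, and the sketch covers only the ranking step, not how EAP-GP computes the scores.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical edge representation: (source node, target node).
Edge = Tuple[str, str]


def select_top_n_edges(scores: Dict[Edge, float], n: int) -> List[Edge]:
    """Return the n edges with the highest scores, best first.

    Illustrative only: it ranks precomputed attribution scores and
    keeps the top n, as described in the quoted setup.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # nlargest returns (edge, score) pairs sorted by score, descending.
    top = heapq.nlargest(n, scores.items(), key=lambda item: item[1])
    return [edge for edge, _ in top]


# Toy scores (made up for illustration, not taken from the paper).
toy_scores = {
    ("input", "a0.h1"): 0.42,
    ("a0.h1", "m0"): 0.10,
    ("m0", "a1.h3"): 0.77,
    ("a1.h3", "logits"): 0.05,
}
circuit = select_top_n_edges(toy_scores, n=2)
```

Here `n` plays the role of the edge-budget hyperparameter mentioned in the setup; varying it would correspond to evaluating circuits at different sparsity levels.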