Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
Authors: Yaniv Nikankin, Dana Arad, Yossi Gandelsman, Yonatan Belinkov
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that while circuits are largely disjoint between modalities, they implement relatively similar functionalities: the differences lie primarily in processing modality-specific data positions (an image or a text sequence). Zooming in on the image data representations, we observe they become aligned with the higher-performing analogous textual representations only towards later layers, too late in processing to effectively influence subsequent positions. To overcome this, we patch the representations of visual data tokens from later layers back into earlier layers. In experiments with multiple tasks and models, this simple intervention closes a third of the performance gap between the modalities, on average. |
| Researcher Affiliation | Academia | Yaniv Nikankin¹, Dana Arad¹, Yossi Gandelsman², Yonatan Belinkov¹. ¹Technion Israel Institute of Technology, ²UC Berkeley |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes methods and definitions in prose and mathematical formulations, particularly in Section 2 and Appendix A, but not in a pseudocode format. |
| Open Source Code | Yes | Code and data available at: https://github.com/technion-cs-nlp/vlm-circuits-analysis |
| Open Datasets | Yes | For the visual variant, the images are taken from the CLEVR dataset (Johnson et al., 2017). ... To demonstrate that back-patching generalizes beyond our constructed dataset in broader visual question-answering, we evaluate its impact on model accuracy using the VQAv2 (Goyal et al., 2017) and Real World QA (xAI, 2024) datasets, two common VQA benchmarks. |
| Dataset Splits | Yes | For our circuit discovery experiments (detailed in Section 2 and Section 4), we allocate each modality-specific task dataset with a 75/25 split: 75% of the prompts for discovery and 25% of the prompts for faithfulness evaluation. |
| Hardware Specification | Yes | Our experiments were conducted on an NVIDIA L40 node equipped with 8 GPUs, each containing 48GB of memory. Peak memory consumption occurred during circuit discovery operations on the Gemma-3-12B-Instruct model, that required the parallel use of 4 GPUs. |
| Software Dependencies | No | Our circuit discovery experiments use our fork of the Transformer Lens library (Nanda & Bloom, 2022), in which we implement patching code for VLMs. |
| Experiment Setup | Yes | To investigate the accuracy gap between textual and visual prompts in VLMs, we construct a dataset of five question-answering tasks, seen in Figure 2. Each task consists of a query paired with data presented in one of two analogous formats: as either an image or a text. ... To ensure we identify meaningful circuits within these models, we follow standard procedures (Mueller et al., 2025) and verify high VLM performance (substantially above-chance accuracy) on each task. We report accuracies in Appendix C and focus on the more common case where models achieve higher accuracy on textual variants compared to visual variants. |
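
The Dataset Splits row quotes a 75/25 division of each modality-specific task dataset into discovery and faithfulness-evaluation prompts. A minimal sketch of such a split is shown below. The function name `split_prompts`, the seeded shuffle, and the placeholder prompts are assumptions made for illustration; the quoted text does not say whether the authors shuffle before splitting or how they seed it, and this is not taken from their repository.

```python
import random


def split_prompts(prompts, discovery_fraction=0.75, seed=0):
    """Split prompts into (discovery, evaluation) subsets.

    Hypothetical helper: the seeded shuffle is an assumption,
    not a detail confirmed by the paper.
    """
    rng = random.Random(seed)      # local RNG so the split is repeatable
    shuffled = list(prompts)       # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * discovery_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder prompts standing in for one modality-specific task dataset.
prompts = [f"prompt_{i}" for i in range(100)]
discovery, evaluation = split_prompts(prompts)
print(len(discovery), len(evaluation))
```

Fixing the seed keeps the discovery and evaluation subsets stable across runs, which matters when faithfulness is measured on prompts the circuit discovery step must not have seen.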