Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

ZEBRA: Towards Zero-Shot Cross-Subject Generalization for Universal Brain Visual Decoding

Authors: Haonan Wang, Jingyu Lu, Hongrui Li, Xiaomeng Li

NeurIPS 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments show that ZEBRA significantly outperforms zero-shot baselines and achieves performance comparable to fully finetuned models on several metrics. Quantitative Results. We evaluate ZEBRA against representative methods across various training regimes on the Natural Scenes Dataset, with results averaged over subjects 1, 2, 5, and 7. We conduct ablation studies on Subject 1 (trained on Subjects 2-8) to assess the contribution of each component in ZEBRA. |
| Researcher Affiliation | Academia | Haonan Wang, Jingyu Lu, Hongrui Li, Xiaomeng Li The Hong Kong University of Science and Technology EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods in text and uses figures (e.g., Figure 2: Core idea of ZEBRA, Figure 3: ZEBRA consists of two key components) to illustrate the architecture and flow, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and model weights are available at: https://github.com/xmed-lab/ZEBRA. |
| Open Datasets | Yes | Dataset. We use the Natural Scenes Dataset (NSD) [17] for both training and evaluation. NSD contains visual image stimulus and corresponding fMRI recordings of 8 subjects, with each subject viewing 8,000-9,000 images. [17] E. J. Allen, G. St-Yves, Y. Wu, J. L. Breedlove, J. S. Prince, L. T. Dowdle, M. Nau, B. Caron, F. Pestilli, I. Charest, et al., A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence, Nature Neuroscience, vol. 25, no. 1, pp. 116-126, 2022. |
| Dataset Splits | Yes | For each test subject, we use all other 7 subjects to train the model and tested on the unseen subject with unseen test split. The final results were tested on subjects 1, 2, 5 or 7, since these subjects complete all scanning sessions, sharing the same 982 images as testing data. |
| Hardware Specification | Yes | All experiments were conducted for 60 epochs using 8 NVIDIA RTX H800 GPUs with a total batch size of 128 (16 samples per GPU). |
| Software Dependencies | No | The paper mentions using specific optimizers like AdamW and generative models like SDXL unCLIP, but does not provide specific version numbers for software libraries or frameworks such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | All experiments were conducted for 60 epochs using 8 NVIDIA RTX H800 GPUs with a total batch size of 128 (16 samples per GPU). We adopt the AdamW optimizer [48] with a learning rate of 1e-4, following the OneCycle learning rate schedule [49]. In the inference stage, we follow MindEye2's two-stage decoding process. First, the predicted image latents are decoded into coarse images using SDXL unCLIP. These coarse outputs are then refined using base SDXL in image-to-image mode, guided by predicted captions. The refinement starts from a noised version of the coarse image, skipping the first 50% of diffusion steps. ϑ is set to 30 following previous methods [4]. |
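
The Dataset Splits entry above describes a leave-one-subject-out protocol: each of subjects 1, 2, 5, and 7 is held out in turn, and the model is trained on the remaining 7 NSD subjects. The following is a minimal sketch of how those folds could be enumerated. The function and constant names are hypothetical and are not taken from the ZEBRA repository; the sketch covers only subject assignment, not data loading or the shared 982-image test split.

```python
# Illustrative sketch only: names are assumptions, not the ZEBRA codebase.

ALL_SUBJECTS = tuple(range(1, 9))  # NSD contains 8 subjects
TEST_SUBJECTS = (1, 2, 5, 7)       # subjects reported as completing all scanning sessions


def leave_one_subject_out(all_subjects=ALL_SUBJECTS, test_subjects=TEST_SUBJECTS):
    """Map each held-out test subject to the list of subjects used for training."""
    folds = {}
    for held_out in test_subjects:
        if held_out not in all_subjects:
            raise ValueError(f"unknown subject: {held_out}")
        # Train on every subject except the one held out for zero-shot testing.
        folds[held_out] = [s for s in all_subjects if s != held_out]
    return folds


if __name__ == "__main__":
    for held_out, train in leave_one_subject_out().items():
        print(f"test on subject {held_out}, train on subjects {train}")
```

Under this reading, the ablation setting quoted in the Research Type entry (Subject 1 tested, Subjects 2-8 used for training) corresponds to the fold keyed by subject 1.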