Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Gate to the Vessel: Residual Experts Restore What SAM Overlooks

Authors: Weili Jiang, Jinrong Lv, Xun Gong, Xiaomeng Li, Chubin Ou

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on five public vascular segmentation datasets demonstrate that Fine SAM++ consistently outperforms both SAM-adapted baselines and task-specific models in terms of accuracy, topological consistency.
Researcher Affiliation	Academia	Weili Jiang School of Computing and Artificial Intelligence Southwest Jiaotong University EMAIL Jinrong Lv School of Computing and Artificial Intelligence Southwest Jiaotong University EMAIL Xun Gong School of Computing and Artificial Intelligence Southwest Jiaotong University EMAIL Xiaomeng Li Department of Electronic and Computer Engineering The Hong Kong University of Science and Technology EMAIL Chubin Ou Institute of Biomedical Engineering, Peking University Shenzhen Graduate School Department of Radiology, Guangdong Provincial People's Hospital EMAIL
Pseudocode	No	The paper describes methods using text and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code	No	The paper does not provide an explicit statement about releasing source code, nor does it include a link to a code repository. The NeurIPS checklist states 'The paper uses publicly available data and provides sufficient explanations to faithfully reproduce the main experimental results.', which focuses on reproducibility rather than direct code access.
Open Datasets	Yes	We evaluate Fine SAM++ across five publicly available vascular segmentation datasets spanning three imaging modalities. The DRIVE dataset [23] contains two-dimensional retinal fundus images with ground truth vessel masks. ROSE [24] provides retinal vessel segmentation from 2D optical coherence tomography angiography (OCTA) scans. FIVES [25] includes 800 high-resolution multi-disease color fundus photographs annotated for vessel structures. DCA1 [26] and CHUAC [27] are coronary angiography datasets containing fluoroscopic X-ray vessel images.
Dataset Splits	Yes	DRIVE [23]. The DRIVE dataset consists of 40 retinal fundus images (584 × 565 pixels) for vessel segmentation. We follow the official split of 20 training and 20 testing images. ROSE [24]. The ROSE dataset contains 2D retinal optical coherence tomography angiography (OCTA) scans. We use the ROSE-1 (SVC) subset comprising 30 training and 9 testing images (304 × 304 pixels). FIVES [25]. The FIVES dataset (Fundus Image Vessel Segmentation) provides 800 high-resolution color fundus images (2048 × 2048 pixels) with pixel-level vessel annotations. The dataset is split into 600 training and 200 testing images. DCA1 [26]. The DCA1 dataset contains 134 coronary angiography images (300 × 300 pixels). We follow the dataset's standard split with 100 training and 34 testing images. CHUAC [27]. The CHUAC dataset consists of 30 coronary angiography images (189 × 189 pixels) with vessel annotations. Following [28], we split the dataset into 20 training and 10 testing images. Synapse Dataset. The dataset [29] contains 30 subjects for training and 20 subjects for testing with abdominal CT scans... Consistent with the partitioning strategy outlined in [30].
Hardware Specification	Yes	All experiments are implemented using PyTorch and trained on two NVIDIA RTX 4090 GPUs.
Software Dependencies	No	The paper mentions 'All experiments are implemented using PyTorch' but does not specify a version number for PyTorch or any other software libraries used, which is required for a reproducible description.
Experiment Setup	Yes	All experiments are implemented using PyTorch and trained on two NVIDIA RTX 4090 GPUs. Data augmentation includes random elastic deformation, rotation, scaling, and intensity jittering. For the backbone, we follow [2] and integrate LoRA adapters into the frozen SAM encoder with a rank of 4. We adopt the ViT-B configuration of SAM as the base encoder. For fair comparison across datasets, all images are resized to 512 × 512 resolution. The maximum training epoch is set to 300. We use the AdamW optimizer with β1 = 0.9, β2 = 0.999, and weight decay of 0.1. The initial learning rate is set to 5 × 10⁻⁵ and decayed using a cosine annealing schedule. All hyperparameters are fixed across datasets without additional tuning to ensure fair comparison and reproducibility.
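
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with a short, standard-library-only sketch. It evaluates the usual cosine annealing formula (no warm restarts) with the stated initial learning rate and 300 epochs. The minimum learning rate, the per-epoch update granularity, and the function name `cosine_annealed_lr` are assumptions made for illustration; the quoted text does not specify them, and this is not the authors' code.

```python
import math

# Values quoted in the Experiment Setup row.
BASE_LR = 5e-5      # initial learning rate
MAX_EPOCHS = 300    # maximum training epoch

# Assumption: the quoted text gives no minimum learning rate; 0.0 is used here.
ETA_MIN = 0.0


def cosine_annealed_lr(epoch, base_lr=BASE_LR, total_epochs=MAX_EPOCHS, eta_min=ETA_MIN):
    """Learning rate at `epoch` under cosine annealing without restarts.

    Assumes the rate is updated once per epoch (hypothetical; not stated in the source).
    """
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs]")
    cosine = math.cos(math.pi * epoch / total_epochs)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


if __name__ == "__main__":
    # Print the rate at a few checkpoints across training.
    for epoch in (0, 75, 150, 225, 300):
        print(f"epoch {epoch:3d}: lr = {cosine_annealed_lr(epoch):.3e}")
```

Under these assumptions the rate starts at the stated initial value, reaches half of it at the midpoint of training, and decays to the assumed minimum at the final epoch. A different minimum or a per-iteration update would change the intermediate values.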