Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Optical Coherence Tomography Harmonization with Anatomy-Guided Latent Metric Schrödinger Bridges

Authors: Shuwen Wei, Samuel Remedios, Blake Dewey, Zhangxing Bian, Shimeng Wang, Junyu Chen, Bruno Jedynak, Shiv Saidha, Peter Calabresi, Aaron Carass, Jerry L Prince

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	4 Experiments. Dataset: The OCT dataset consists of 388 Cirrus volumes and 338 Spectralis volumes. The 388 Cirrus volumes come from 194 subjects (388 eyes in total). The 338 Spectralis volumes come from 165 subjects (269 eyes in total). Anatomy Consistency Comparison: We compared the performance of our proposed LMSB with the DSBM and the DDIB by evaluating the anatomical consistency before and after harmonization. To do so, we ran all the methods on the testing dataset that contains 1,500 Cirrus B-scans and 1,500 Spectralis OCT B-scans to generate Cirrus B-scans from Spectralis B-scans and vice versa as two harmonization tasks. We then applied the deep learning based retinal OCT segmentation method (He et al., 2019, 2021, 2023) to identify nine retinal boundaries. We computed the mean absolute error (MAE) for these boundary locations before and after harmonization. The results are summarized in Table 1.
Researcher Affiliation	Academia	(1) Johns Hopkins University; (2) Johns Hopkins School of Medicine; (3) Portland State University. Corresponding author: EMAIL
Pseudocode	No	The paper describes the methods in detailed text and uses diagrams like Figure 3 to illustrate network architecture, but it does not include explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Corresponding author: EMAIL code: https://github.com/Shuwen-Wei/lmsb
Open Datasets	No	Dataset: The OCT dataset consists of 388 Cirrus volumes and 338 Spectralis volumes. The 388 Cirrus volumes come from 194 subjects (388 eyes in total). The 338 Spectralis volumes come from 165 subjects (269 eyes in total). ... Our data cannot be shared due to patient privacy issues.
Dataset Splits	Yes	Training and testing splits are done by subject: 352 Cirrus volumes from 176 subjects (352 eyes) and 307 Spectralis volumes from 156 subjects (252 eyes) for training and 36 Cirrus volumes from 18 subjects (36 eyes) and 31 Spectralis volumes from 9 subjects (17 eyes) for testing. Our experiments are in 2D and operate on B-scans independently. Because each Cirrus volume contains 128 B-scans and each Spectralis OCT volume contains 49 B-scans over the same field of view 6 × 6 mm², the Cirrus volumes have denser B-scan sampling. Therefore, to reduce anatomical redundancy across individual Cirrus volume B-scans, we extract every third B-scan. Specifically, in the training dataset, we extract 15,000 B-scans in the training dataset and 1,500 B-scans in the testing dataset for both Cirrus and Spectralis, resulting in 30,000 training B-scans and 3,000 testing B-scans with no subject data leakage between train and test splits.
Hardware Specification	Yes	The proposed model LMSB requires only a single GPU with 48 GB of memory for training and 15 GB for evaluation when the batch size is 16. However, in practice we used several different GPUs in parallel to train and test different models with different hyperparameters. To train all the models including the proposed model, comparison models and ablation models, we used seven GPUs in total, including four NVIDIA A40 (48 GB), two NVIDIA RTX A6000 (48 GB), and one QUADRO RTX 8000 (48 GB).
Software Dependencies	No	The paper does not explicitly list specific software dependencies with version numbers. It cites "normflows: A PyTorch Package for Normalizing Flows" but does not state the PyTorch version used for their own work or other key software versions.
Experiment Setup	Yes	Network Training: We trained the invertible neural network f with 15,000 Cirrus and 15,000 Spectralis OCT B-scans in the training set. ... We ran 30 steps of IMF in total to solve DSBM and LMSB. For the first IMF step, we used independent coupling as the initial coupling, and we ran 10,000 iterations both forward and backward. For the remaining IMF steps, we ran 2,500 iterations both forward and backward, and we cached simulation trajectories every 1,250 iterations. We set the number of diffusion steps to 100. We let the diffusion coefficient be constant at each diffusion step and choose σ² = 0.1. After training, we compared their performance with two sampling strategies: 1) stochastic differential equation (SDE); and 2) ordinary differential equation (ODE). ... In practice, we do not go through all 1,000 diffusion steps, but stop early at 500 steps for a balance between sampling quality and harmonization quality (Wei et al., 2026).
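
The figures quoted in the Dataset Splits row can be checked for internal consistency with a short script. The sketch below is not from the paper or its repository: the names `REPORTED`, `usable_bscans`, and `check_splits` are hypothetical, and it assumes that "every third B-scan" means indices 0, 3, 6, ... within each Cirrus volume and that Spectralis volumes are used without subsampling. It confirms that the train and test counts add up to the dataset totals and that enough B-scans exist per split to draw the reported 15,000 training and 1,500 testing B-scans per scanner.

```python
# Counts are copied from the "Dataset Splits" row of the table.
# ASSUMPTION: Cirrus subsampling starts at index 0 with stride 3;
# Spectralis volumes are not subsampled (stride 1).
REPORTED = {
    "Cirrus": {
        "total": {"volumes": 388, "subjects": 194, "eyes": 388},
        "train": {"volumes": 352, "subjects": 176, "eyes": 352},
        "test": {"volumes": 36, "subjects": 18, "eyes": 36},
        "bscans_per_volume": 128,
        "stride": 3,
    },
    "Spectralis": {
        "total": {"volumes": 338, "subjects": 165, "eyes": 269},
        "train": {"volumes": 307, "subjects": 156, "eyes": 252},
        "test": {"volumes": 31, "subjects": 9, "eyes": 17},
        "bscans_per_volume": 49,
        "stride": 1,
    },
}

# B-scans per scanner that the paper reports extracting for each split.
TARGET_BSCANS = {"train": 15_000, "test": 1_500}


def usable_bscans(n_volumes, bscans_per_volume, stride):
    """Number of B-scans available after keeping every `stride`-th slice."""
    per_volume = len(range(0, bscans_per_volume, stride))
    return n_volumes * per_volume


def check_splits(reported=REPORTED, targets=TARGET_BSCANS):
    """Return, per scanner, whether split counts sum to the totals and
    whether each split holds at least the targeted number of B-scans."""
    report = {}
    for scanner, info in reported.items():
        sums_match = all(
            info["train"][key] + info["test"][key] == info["total"][key]
            for key in ("volumes", "subjects", "eyes")
        )
        available = {
            split: usable_bscans(
                info[split]["volumes"], info["bscans_per_volume"], info["stride"]
            )
            for split in targets
        }
        enough = all(available[split] >= targets[split] for split in targets)
        report[scanner] = {
            "sums_match": sums_match,
            "available": available,
            "enough": enough,
        }
    return report


if __name__ == "__main__":
    for scanner, result in check_splits().items():
        print(scanner, result)
```

Under these assumptions every check passes, but the margin is narrow: if the Cirrus stride started at index 2 instead, each volume would yield 42 rather than 43 B-scans, and 352 training volumes would fall short of 15,000.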