Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Orochi: Versatile Biomedical Image Processor

Authors: Gaole Dai, Chenghao Zhou, Yu Zhou, Rongyu Zhang, Yuan Zhang, Chengkai Hou, Tiejun Huang, Jianxu Chen, Shanghang Zhang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conducted comprehensive comparisons strictly following the setups in published specialist models (UniFMIR [6], VCM [5], Transmorph [32], and BSAFusion [25], see Appendix A.4 for details), resulting in more than 30 state-of-the-art baselines across multiple benchmarks for various biomedical image-processing tasks to demonstrate the effectiveness and versatility of Orochi.
Researcher Affiliation	Academia	1 State Key Laboratory of Multimedia Information Processing, School of Computer Science, Peking University; 2 Academy for Advanced Interdisciplinary Studies, Peking University; 3 Leibniz-Institut für Analytische Wissenschaften ISAS e.V.
Pseudocode	No	The paper describes methods such as Random Multi-scale Sampling and Task-related Joint-embedding Pre-training using descriptive text and mathematical formulations, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Our pre-trained weights and code will be released. A Readme.md file is attached along with the code submitted in supplemental material.
Open Datasets	Yes	A combined multi-modal biomedical image dataset aggregated from over 100 public studies, encompassing various imaging modalities and degradation types [51, 50, 49]. ... The OASIS brain MRI dataset from the Learn2Reg 2021 challenge... [60, 58]. ... The Harvard Whole Brain Atlas (HBA)... [57]. ... The CARE microscopy image dataset... [4].
Dataset Splits	Yes	We conducted comprehensive comparisons strictly following the setups in published specialist models (UniFMIR [6], VCM [5], Transmorph [32], and BSAFusion [25], see Appendix A.4 for details). ... The OASIS brain MRI dataset from the Learn2Reg 2021 challenge, used to evaluate the overlap of segmented regions and the smoothness of the deformation fields [60, 58].
Hardware Specification	Yes	We have two sets of pre-training devices. The A800 80Gx8 device is used for local pre-train and the H100 40Gx8 device is for streaming pre-train. The device we use for fine-tuning is NVIDIA 4090 24Gx4.
Software Dependencies	No	The paper refers to public GitHub implementations of baselines (Swin-Transformer, Transmorph, BSAFusion, Inverse SR, VCM, UniFMIR) that were adapted or built upon. However, it does not state version numbers for key software dependencies such as Python, PyTorch, or CUDA used in the authors' own implementation of Orochi.
Experiment Setup	Yes	Table 6 (Pretraining Configuration Parameters for Orochi-B) provides detailed hyperparameters, including img_size, patch_size, embed_dim, depths, drop_path_rate, batch_size, lr, weight_decay, warmup_ratio, max_epoch, and optimizer/scheduler details. Additionally, the paper states under 'Fine-tuning': 'We followed the same setups as our code base for each task (see Section ??), including the tuning resolution, epoch number, optimizer configurations and loss designs.'