SegVol: Universal and Interactive Volumetric Medical Image Segmentation

Authors: Yuxin Du, Fan Bai, Tiejun Huang, Bo Zhao

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on 22 anatomical segmentation tasks verify that SegVol outperforms the competitors in 19 tasks, with improvements of up to 37.24% over the runner-up methods. We demonstrate the effectiveness and importance of specific designs through an ablation study.
Researcher Affiliation | Collaboration | Yuxin Du1,2, Fan Bai1,2,3, Tiejun Huang2,4, Bo Zhao1,2 — 1School of Artificial Intelligence, Shanghai Jiao Tong University; 2BAAI; 3The Chinese University of Hong Kong; 4Peking University
Pseudocode | Yes | The detailed fine-tuning algorithm of SegVol is presented in Section B. ... we abstract the core training code as Algorithm 1 and Figure 8 to clarify the training process of SegVol.
Open Source Code | Yes | The model and code are publicly available at: https://github.com/BAAI-DCAI/SegVol.
Open Datasets | Yes | Doing our utmost, we collected 25 open-source segmentation CT datasets, including CHAOS[40, 41, 42], HaN-Seg[43], AMOS22[44], AbdomenCT-1k[45], KiTS23[46], KiPA22[47, 48, 49, 50], KiTS19[51], BTCV[52], Pancreas-CT[53, 54, 35], 3D-IRCADB[55], FLARE22[56, 57], TotalSegmentator[58], CT-ORG[33, 34, 24, 35], VerSe19, VerSe20[59, 60, 61], SLIVER07[62], QUBIQ[63], six MSD datasets[56], LUNA16[36], and WORD[64]. Their detailed information and availability are shown in Section A.
Dataset Splits | Yes | Each subset of the joint dataset is split into 80% training data and 20% test data. To compare with these SAM-like interactive segmentation models, we evaluate the models on 1,778 cases from the validation set of AMOS22[44], the whole novel annotated set of the Universal Lesion Segmentation Challenge 23 (ULS23)[74], and the released labeled set of SegTHOR[75].
Hardware Specification | Yes | All of the above training processes are implemented on 8 NVIDIA A100-SXM4-40GB GPUs.
Software Dependencies | No | The paper mentions using the 'SimMIM algorithm[69]' and 'AdamW optimizer[73]' but does not provide specific version numbers for software or libraries like Python, PyTorch, or CUDA.
Experiment Setup | Yes | During the pre-training, we follow the SimMIM algorithm[69] to train the 3D ViT encoder of SegVol on the collected 96K CTs for 2000 epochs. In the supervised fine-tuning stage, we train SegVol (with the text encoder frozen) on the labeled 25 volumetric medical image segmentation datasets for 270 epochs with batch size 32 and input size (32, 256, 256), using the AdamW optimizer[73].
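The per-subset 80/20 split reported in the Dataset Splits row can be sketched as follows. This is a minimal illustration only: the function name, the sorting step, and the fixed-seed shuffle are assumptions, since the paper does not describe how cases were ordered or randomized before splitting.

```python
import random

def split_subset(case_ids, train_frac=0.8, seed=0):
    """Split one dataset's cases into train/test per the reported 80/20 protocol.

    The seed and shuffle scheme are illustrative assumptions; only the
    80/20 ratio and the per-subset independence come from the report.
    """
    ids = sorted(case_ids)                     # stable order before shuffling
    random.Random(seed).shuffle(ids)           # deterministic, reproducible split
    n_train = int(len(ids) * train_frac)
    return ids[:n_train], ids[n_train:]

# Each of the 25 subsets would be split independently, e.g.:
train, test = split_subset([f"case_{i:03d}" for i in range(100)])
```

Splitting each subset independently (rather than pooling all 25 datasets first) preserves the per-dataset class balance between the training and test portions.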
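The pre-training and fine-tuning settings quoted in the Experiment Setup and Hardware rows can be gathered into one configuration sketch. Only the quoted values are taken from the report; the dict layout, key names, and the (depth, height, width) axis-order reading of the input size are assumptions, and unspecified hyperparameters (e.g. learning rate) are deliberately omitted.

```python
# Hyperparameters quoted in the report; keys and structure are illustrative.
PRETRAIN = {
    "algorithm": "SimMIM",         # masked-image-modeling pre-training of the 3D ViT encoder
    "num_ct_volumes": 96_000,      # "96K CTs"
    "epochs": 2000,
}

FINETUNE = {
    "num_datasets": 25,            # labeled volumetric segmentation datasets
    "epochs": 270,
    "batch_size": 32,
    "input_size": (32, 256, 256),  # axis order (depth, height, width) assumed
    "optimizer": "AdamW",
    "text_encoder_frozen": True,
    "hardware": "8x NVIDIA A100-SXM4-40GB",
}
```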