SegVol: Universal and Interactive Volumetric Medical Image Segmentation
Authors: Yuxin Du, Fan Bai, Tiejun Huang, Bo Zhao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on 22 anatomical segmentation tasks verify that SegVol outperforms the competitors in 19 tasks, with improvements up to 37.24% compared to the runner-up methods. We demonstrate the effectiveness and importance of specific designs by ablation study. |
| Researcher Affiliation | Collaboration | Yuxin Du (1, 2), Fan Bai (1, 2, 3), Tiejun Huang (2, 4), Bo Zhao (1, 2). Affiliations: (1) School of Artificial Intelligence, Shanghai Jiao Tong University; (2) BAAI; (3) The Chinese University of Hong Kong; (4) Peking University |
| Pseudocode | Yes | The detailed fine-tuning algorithm of SegVol is presented in Section B. ... we abstract the core training code as Algorithm 1 and Figure 8 to clarify the training process of SegVol. |
| Open Source Code | Yes | The model and code are publicly available at: https://github.com/BAAI-DCAI/SegVol. |
| Open Datasets | Yes | Doing our utmost, we collected 25 open-source segmentation CT datasets, including CHAOS[40, 41, 42], HaN-Seg[43], AMOS22[44], AbdomenCT-1k[45], KiTS23[46], KiPA22[47, 48, 49, 50], KiTS19[51], BTCV[52], Pancreas-CT[53, 54, 35], 3D-IRCADB[55], FLARE22[56, 57], TotalSegmentator[58], CT-ORG[33, 34, 24, 35], VerSe19, VerSe20[59, 60, 61], SLIVER07[62], QUBIQ[63], six MSD datasets[56], LUNA16[36], and WORD[64]. Their detailed information and availability are shown in Section A. |
| Dataset Splits | Yes | Each subset of the joint dataset is split into 80% training data and 20% test data. To compare with these SAM-like interactive segmentation models, we evaluate the models on 1,778 cases from the validation set of AMOS22[44], the whole novel annotated set of Universal Lesion Segmentation Challenge 23 (ULS23)[74], and the released labeled set of SegTHOR[75]. |
| Hardware Specification | Yes | All the above training process is implemented on 8 NVIDIA A100-SXM4-40GB. |
| Software Dependencies | No | The paper mentions using 'SimMIM algorithm[69]' and 'AdamW optimizer[73]' but does not provide specific version numbers for software or libraries like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | During the pre-training, we follow SimMIM algorithm[69] to train the 3D ViT encoder of SegVol on the collected 96K CTs for 2000 epochs. In the supervised fine-tuning stage, we train SegVol (with the text encoder frozen) on the labeled 25 volumetric medical image segmentation datasets for 270 epochs with batch size 32 and input size (32, 256, 256), using AdamW optimizer[73]. |
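
The Dataset Splits row states that each subset of the joint dataset is split into 80% training and 20% test data. The quoted text does not say how cases were assigned, so the following is only a minimal sketch of a per-subset split, not the authors' code. The seeded shuffle, the rounding rule, the function names, and the case IDs are all assumptions made for illustration.

```python
import random


def split_subset(case_ids, train_fraction=0.8, seed=0):
    """Shuffle one dataset's case IDs and split them into (train, test) lists.

    The seed and the shuffle-based assignment are assumptions; the paper's
    quoted text only gives the 80/20 ratio.
    """
    ids = sorted(case_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_train = int(round(len(ids) * train_fraction))
    return ids[:n_train], ids[n_train:]


def split_joint_dataset(subsets, train_fraction=0.8, seed=0):
    """Split every subset independently, so each one keeps the same ratio."""
    return {
        name: split_subset(ids, train_fraction, seed)
        for name, ids in subsets.items()
    }


# Hypothetical subsets with made-up case IDs, used only to exercise the functions.
subsets = {
    "dataset_a": [f"a_{i:03d}" for i in range(10)],
    "dataset_b": [f"b_{i:03d}" for i in range(25)],
}
splits = split_joint_dataset(subsets)
for name, (train_ids, test_ids) in splits.items():
    print(name, len(train_ids), len(test_ids))
```

Splitting each subset on its own, rather than pooling all cases first, keeps small datasets represented in both the training and the test portions.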