Multimodal Representation Distribution Learning for Medical Image Segmentation

Authors: Chao Huang, Weichao Cai, Qiuping Jiang, Zhihua Wang

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three datasets show that our method has superior performance.
Researcher Affiliation | Academia | Chao Huang¹, Weichao Cai¹, Qiuping Jiang², and Zhihua Wang³. ¹School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University; ²School of Information Science and Engineering, Ningbo University; ³Department of Engineering, Shenzhen MSU-BIT University.
Pseudocode | No | The paper describes its methodology using architectural descriptions and mathematical equations but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Source codes will be available at https://github.com/GPIOX/Multimodal.git.
Open Datasets | Yes | The proposed method is evaluated on three medical image segmentation datasets: MoNuSeg [Kumar et al., 2017], MosMedData+ [Li et al., 2023b], and GlaS [Sirinukunwattana et al., 2017].
Dataset Splits | Yes | The ratio of training, validation, and test sets is the same as in [Li et al., 2023b].
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA RTX 3090 GPU with 24 GB of memory.
Software Dependencies | No | The paper mentions optimizers such as AdamW and learning-rate schedulers but does not specify software versions for programming languages or libraries (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | The initial learning rate is set to 1e-3 for all datasets. The input image size is 224 × 224 for MoNuSeg, GlaS, and MosMedData+. An early-stopping mechanism is adopted: training halts once model performance has not improved for 50 epochs. The batch size is 2 for MoNuSeg and GlaS and 24 for MosMedData+. The default number of learnable features K is set to 32. Based on experiments, we set λ1 = 0.5, λ2 = 0.5, and λ3 = 1.
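
The experiment-setup row above lists concrete hyperparameters, so a minimal sketch of how they might be wired together in PyTorch is shown below. Only the numbers (learning rate, input size, batch sizes, K, the λ weights, and the 50-epoch patience) come from the paper; the placeholder network, the names (LR, BATCH_SIZE, combined_loss, PATIENCE), and the loop structure are assumptions for illustration, not the authors' implementation.

```python
"""Minimal sketch of the reported training configuration (assumptions noted inline)."""
import torch
import torch.nn as nn

# Hyperparameters quoted from the paper's experiment setup.
LR = 1e-3                               # initial learning rate, all datasets
INPUT_SIZE = (224, 224)                 # MoNuSeg, GlaS, and MosMedData+
BATCH_SIZE = {"MoNuSeg": 2, "GlaS": 2, "MosMedData+": 24}
K = 32                                  # default number of learnable features
LAMBDAS = (0.5, 0.5, 1.0)               # loss weights lambda_1, lambda_2, lambda_3
PATIENCE = 50                           # early-stopping patience in epochs

# Stand-in segmentation network; the real architecture is not given in this summary.
model = nn.Conv2d(3, 1, kernel_size=1)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)

def combined_loss(l1, l2, l3):
    """Weighted total loss L = lambda_1*L1 + lambda_2*L2 + lambda_3*L3."""
    return LAMBDAS[0] * l1 + LAMBDAS[1] * l2 + LAMBDAS[2] * l3

# Early stopping: halt once validation performance has not improved
# for PATIENCE consecutive epochs.
best, stale = float("-inf"), 0
for epoch in range(1000):
    val_metric = 0.0  # placeholder for a real validation score
    if val_metric > best:
        best, stale = val_metric, 0
    else:
        stale += 1
    if stale >= PATIENCE:
        break
```

The batch sizes are kept per-dataset because the paper reports different values for MoNuSeg/GlaS and MosMedData+; everything else in the loop is generic boilerplate under those assumptions.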