Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels

Authors: Jizong Peng, Ping Wang, Christian Desrosiers, Marco Pedersoli

NeurIPS 2021

Reproducibility assessment (each entry lists the variable, the extracted result, and the supporting LLM response):

Research Type: Experimental
"Results on five medical image segmentation datasets show that our approach: i) highly boosts the performance of a model trained on a few scans, ii) outperforms previous contrastive and semi-supervised approaches, and iii) reaches close to the performance of a model trained on the full data. We empirically validate our contributions on five well-known medical imaging datasets, and show the proposed approach to outperform the contrastive learning method of [7] as well as several state-of-the-art semi-supervised learning methods for segmentation [41, 43, 50, 58, 60]."

Researcher Affiliation: Academia
Jizong Peng, ETS Montreal (jizong.peng.1@etsmtl.net); Ping Wang, ETS Montreal (ping.wang.1@ens.etsmtl.ca); Christian Desrosiers, ETS Montreal (christian.desrosiers@etsmtl.ca); Marco Pedersoli, ETS Montreal (marco.pedersoli@etsmtl.ca)

Pseudocode: No
The paper describes mathematical formulations and optimization processes but does not include structured pseudocode or an algorithm block.

Open Source Code: No
The paper contains no explicit statement or link indicating that the source code for the methodology is openly available.

Open Datasets: Yes
"Five clinically-relevant benchmark datasets for medical image segmentation are used for our experiments: the Automated Cardiac Diagnosis Challenge (ACDC) dataset [3], the Prostate MR Image Segmentation 2012 Challenge (PROMISE12) dataset [29], and the Multi-Modality Whole Heart Segmentation Challenge (MMWHS) dataset [64], as well as the Hippocampus and Spleen segmentation datasets from [1]."

Dataset Splits: Yes
"For all datasets, we split images into training, validation and test sets, which remain unchanged during all experiments. We train the model with only a few scans of the dataset as labeled data (the rest of the data is used without annotations as in a semi-supervised setting) and report results in terms of 3D DSC metric [4] on the test set. Details on the training set split, data pre-processing, augmentation methods and evaluation metrics can be found in the Supplementary Material. For all datasets, we report the segmentation performance by varying the number of labeled scans across experiments. For the ACDC dataset, this number ranges from 1 to 4, representing 0.5% to 2% of all available data. For PROMISE12, we use 3 to 7 scans, representing 6% to 14% of the whole data. For MMWHS, we use 1 and 2 annotated scans, corresponding to 10% and 20% of the training data. We use 1 to 4 scans as annotated data for the Hippocampus dataset, representing 0.5% to 2% of the whole data, and 2 to 4 scans for the Spleen dataset, which corresponds to 5.7% to 11.4% of the whole available training data."

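The labeled-data budgets quoted above amount to a small experimental grid. A purely illustrative summary as a config dictionary, with the (min, max) scan counts and percentage comments taken from the quoted text (the dictionary name and structure are assumptions, not from the paper):

```python
# Labeled-scan budgets per dataset, as stated in the quoted passage.
# (min, max) = fewest and most annotated scans used; the exact
# train/val/test partitions are only given in the Supplementary Material.
LABELED_SCANS = {
    "ACDC":        (1, 4),  # 0.5% to 2% of available data
    "PROMISE12":   (3, 7),  # 6% to 14% of the whole data
    "MMWHS":       (1, 2),  # 10% and 20% of the training data
    "Hippocampus": (1, 4),  # 0.5% to 2% of the whole data
    "Spleen":      (2, 4),  # 5.7% to 11.4% of the training data
}
```
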
Hardware Specification: No
No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided. The paper mentions "GPU memory" and acknowledges support from Calcul Québec and Compute Canada, but does not state the specific hardware used for the experiments.

Software Dependencies: No (frameworks and optimizers are named, but no version numbers are given)
"We use PyTorch [39] as our training framework and, following [7], employ the U-Net architecture [45] as our segmentation network. ... Network parameters are optimized using stochastic gradient descent (SGD) with a RAdam optimizer [30]."

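The entry names the framework (PyTorch [39]), the backbone (U-Net [45]) and the optimizer (RAdam [30]), but no versions or hyper-parameters. A minimal sketch of the optimizer wiring, with a stand-in module in place of the U-Net and an assumed learning rate:

```python
import torch
from torch import nn

# Stand-in for the U-Net segmentation network of [45]; any nn.Module
# suffices to show how the optimizer is attached to the parameters.
model = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, padding=1)

# RAdam [30] ships in torch.optim from PyTorch 1.10 onward; the learning
# rate here is an assumed placeholder, not the paper's value.
optimizer = torch.optim.RAdam(model.parameters(), lr=1e-3)
```
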
Experiment Setup: Yes
"We provide the detailed training hyper-parameters in the Suppl. Material. For the pre-training process, we obtain representations by projecting the encoder's output to a vector of size 256, using a simple MLP network with one hidden layer and Leaky ReLU activation function, following [9]. Our proposed self-paced contrastive learning objective, defined in Eq. (3), involves a learning pace parameter γ set as γ = γ_start + (γ_end - γ_start) · (cur_epoch / max_epoch)^p. A detailed explanation of the experimental setup of each method and results for the other two datasets can be found in the Supplementary Material."

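Two details in this entry translate directly into code: the MLP projection head used for pre-training and the learning-pace schedule of Eq. (3). A sketch under stated assumptions: the encoder output width, the hidden width, and the γ_start, γ_end, p values below are illustrative, since the paper defers the actual hyper-parameters to the Supplementary Material.

```python
from torch import nn

def learning_pace(cur_epoch: int, max_epoch: int,
                  gamma_start: float, gamma_end: float, p: float) -> float:
    """Self-paced schedule from the quoted text:
    gamma = gamma_start + (gamma_end - gamma_start) * (cur_epoch / max_epoch) ** p
    """
    return gamma_start + (gamma_end - gamma_start) * (cur_epoch / max_epoch) ** p

# Projection head for pre-training, following [9]: a simple MLP with one
# hidden layer and LeakyReLU, mapping encoder features to a 256-d vector.
# The input and hidden widths (512 and 256) are assumptions; only the
# output size of 256 is stated in the paper.
projection_head = nn.Sequential(
    nn.Linear(512, 256),
    nn.LeakyReLU(),
    nn.Linear(256, 256),
)

# Example: gamma ramps monotonically from gamma_start to gamma_end.
for epoch in (0, 25, 50, 100):
    print(epoch, learning_pace(epoch, max_epoch=100,
                               gamma_start=0.1, gamma_end=1.0, p=2.0))
```

Note the effect of the exponent: with p > 1 the pace parameter stays close to γ_start for most of training and rises sharply near the end, while with p < 1 the ramp is front-loaded.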