Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis

Authors: Kai Hu, Ye Xiao, Yuan Zhang, Xieping Gao

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments have verified that our M2CRL significantly enhances the quality of endoscopic video representation learning and exhibits excellent generalization capabilities in multiple downstream tasks (i.e., classification, segmentation and detection).
Researcher Affiliation Academia Kai Hu Xiangtan University kaihu@xtu.edu.cn; Ye Xiao Xiangtan University yxiao@smail.xtu.edu.cn; Yuan Zhang Xiangtan University yuanz@xtu.edu.cn; Xieping Gao Hunan Normal University xpgao@hunnu.edu.cn
Pseudocode No The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code Yes Code is publicly available at: https://github.com/MLMIP/MMCRL.
Open Datasets Yes We conduct experiments on 10 publicly available endoscopic video datasets: Colonoscopic [81], SUN-SEG [82], LDPolypVideo [83], Hyper-Kvasir [84], Kvasir-Capsule [85], CholecTriplet [86], Renji-Hospital [30], PolypDiag [27], CVC-12k [28], and KUMC [29].
Dataset Splits No For downstream fine-tuning, the procedure is as follows: (1) Classification: ...The dataset is divided into 71 normal videos without polyps and 102 abnormal videos with polyps for training, and 20 normal videos and 60 abnormal videos for testing. (2) Segmentation: ...20 videos allocated for training and 9 videos for testing. (3) Detection: ...36 videos allocated for training and 17 videos for testing. The paper clearly defines training and testing splits for the downstream tasks but does not explicitly detail a separate validation set split.
Hardware Specification Yes Pre-training is conducted on 4 Tesla A100 GPUs.
Software Dependencies No The paper mentions software components such as 'AdamW', 'TransUNet', and 'STFT', but does not provide specific version numbers for these or for other software dependencies such as Python or PyTorch.
Experiment Setup Yes The training parameters are shown in Table 7. Pre-training settings: optimizer AdamW [89]; optimizer momentum β1, β2 = 0.9, 0.999; weight decay 4e-2; base learning rate 2e-5; learning rate schedule cosine schedule [90]; warmup epochs 10; pre-training epochs 30; batch size 12; temperature parameters τt, τs = 0.04, 0.07; mask rate ρ = 0.9; attention area threshold γ = 0.6; momentum coefficient λ = 0.996.
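The reported pre-training hyperparameters can be sketched in code to make the schedule concrete. The snippet below is a minimal, dependency-free sketch (not the authors' implementation) assuming the usual interpretation of Table 7: linear learning-rate warmup over the first 10 epochs, cosine decay to zero over the remaining 20, and an EMA (momentum) teacher update with λ = 0.996. Function names such as `lr_at_epoch` and `ema_update` are illustrative.

```python
import math

# Values quoted from the paper's Table 7 (pre-training settings).
BASE_LR = 2e-5
WARMUP_EPOCHS = 10
TOTAL_EPOCHS = 30
MOMENTUM_COEF = 0.996  # lambda for the momentum (teacher) encoder

def lr_at_epoch(epoch: int) -> float:
    """Learning rate at a given epoch: linear warmup, then cosine decay.

    Assumes warmup ramps linearly to BASE_LR and the cosine schedule
    decays to zero at TOTAL_EPOCHS, which is the common convention.
    """
    if epoch < WARMUP_EPOCHS:
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))

def ema_update(teacher_param: float, student_param: float,
               lam: float = MOMENTUM_COEF) -> float:
    """One EMA step for a single teacher parameter (scalar sketch)."""
    return lam * teacher_param + (1.0 - lam) * student_param
```

Under this reading, the learning rate reaches its base value of 2e-5 exactly at the end of warmup and the teacher drifts toward the student at a rate of 1 − λ = 0.004 per step.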