Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis
Authors: Kai Hu, Ye Xiao, Yuan Zhang, Xieping Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have verified that our M2CRL significantly enhances the quality of endoscopic video representation learning and exhibits excellent generalization capabilities in multiple downstream tasks (i.e., classification, segmentation and detection). |
| Researcher Affiliation | Academia | Kai Hu Xiangtan University kaihu@xtu.edu.cn; Ye Xiao Xiangtan University yxiao@smail.xtu.edu.cn; Yuan Zhang Xiangtan University yuanz@xtu.edu.cn; Xieping Gao Hunan Normal University xpgao@hunnu.edu.cn |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is publicly available at: https://github.com/MLMIP/MMCRL. |
| Open Datasets | Yes | We conduct experiments on 10 publicly available endoscopic video datasets: Colonoscopic [81], SUN-SEG [82], LDPolypVideo [83], Hyper-Kvasir [84], Kvasir-Capsule [85], CholecTriplet [86], Renji-Hospital [30], PolypDiag [27], CVC-12k [28], and KUMC [29]. |
| Dataset Splits | No | For downstream fine-tuning, the procedure is as follows: (1) Classification: ...The dataset is divided into 71 normal videos without polyps and 102 abnormal videos with polyps for training, and 20 normal videos and 60 abnormal videos for testing. (2) Segmentation: ...20 videos allocated for training and 9 videos for testing. (3) Detection: ...36 videos allocated for training and 17 videos for testing. The paper clearly defines training and testing splits for the downstream tasks but does not specify a separate validation split. |
| Hardware Specification | Yes | Pre-training is conducted on 4 Tesla A100 GPUs. |
| Software Dependencies | No | The paper mentions software components such as AdamW, TransUNet, and STFT, but does not provide version numbers for these or for other software dependencies such as Python or PyTorch. |
| Experiment Setup | Yes | The training parameters are shown in Table 7. Pre-training settings: optimizer: AdamW [89]; optimizer momentum: β1, β2 = 0.9, 0.999; weight decay: 4e-2; base learning rate: 2e-5; learning rate schedule: cosine [90]; warmup epochs: 10; pre-training epochs: 30; batch size: 12; temperature parameters: τt, τs = 0.04, 0.07; mask rate: ρ = 0.9; attention area threshold: γ = 0.6; momentum coefficient: λ = 0.996. |