Advancing Radiograph Representation Learning with Masked Record Modeling
Authors: Hong-Yu Zhou, Chenyu Lian, Liansheng Wang, Yizhou Yu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we mainly compare MRM against report- and self-supervised R2L methodologies on 5 well-established public datasets. Average results are reported over three training runs. Specifically, we find that MRM offers superior performance in label-efficient fine-tuning. For instance, MRM achieves 88.5% mean AUC on CheXpert using 1% labeled data, outperforming previous R2L methods with 100% labels. A sketch of the mean-AUC metric appears below the table. |
| Researcher Affiliation | Collaboration | Hong-Yu Zhou (1,2), Chenyu Lian (1), Liansheng Wang (1), Yizhou Yu (2,3). (1) School of Informatics, Xiamen University; (2) Department of Computer Science, The University of Hong Kong; (3) AI Lab, Deepwise Healthcare. Emails: whuzhouhongyu@gmail.com, cylian@stu.xmu.edu.cn, lswang@xmu.edu.cn, yizhouy@acm.org |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and models are available at https://github.com/RL4M/MRM-pytorch. |
| Open Datasets | Yes | We conduct pre-training on MIMIC-CXR (Johnson et al., 2019), one of the largest X-ray datasets, that contains more than 370,000 radiograph images from over 220,000 patient studies. ... We evaluate the pre-trained model on 4 X-ray datasets in the classification tasks, which are NIH Chest X-ray (Wang et al., 2017), CheXpert (Irvin et al., 2019), RSNA Pneumonia (Shih et al., 2019), and COVID-19 Image Data Collection (Cohen et al., 2020). For the segmentation task, we fine-tune the pre-trained model on SIIM-ACR Pneumothorax Segmentation. |
| Dataset Splits | Yes | CheXpert introduces a multi-label classification problem on chest X-rays. ... The training/validation/test split each constitutes 218,414/5,000/234 images of the whole dataset. ... We adopt the official data split, where the training/validation/test set comprises 25,184/1,500/3,000 images, respectively. ... The training/validation/test split each constitutes 70%/10%/20% of the whole dataset. ... where the training/validation/test set comprises 356/54/99 radiographs, respectively. ... where the training/validation/test set contains 297/43/86 cases, respectively. ... We follow Huang et al. (2021) to construct the training/validation/test split, where each constitutes 70%/15%/15% of the whole dataset. A sketch of a percentage-based split appears below the table. |
| Hardware Specification | Yes | The pre-training experiments were conducted on 4 GeForce RTX 3080Ti GPUs, and the training time is about 2 days for 200 epochs, requiring 12GB memory from each GPU. ... For fine-tuning on SIIM, we train the segmentation network on 4 GeForce RTX 3080Ti GPUs. For fine-tuning on other datasets, we train the classification network on a single GeForce RTX 3080Ti GPU |
| Software Dependencies | Yes | Our code is implemented using PyTorch 1.8.2 (Paszke et al., 2019). |
| Experiment Setup | Yes | Our code is implemented using PyTorch 1.8.2 (Paszke et al., 2019). The pre-training experiments were conducted on 4 GeForce RTX 3080Ti GPUs, and the training time is about 2 days for 200 epochs, requiring 12GB memory from each GPU. The training batch size is 256. We use AdamW (Loshchilov & Hutter, 2017) as the default optimizer, where the initial learning rate is 1.5e-4, weight decay is 0.05, β1 is 0.9, and β2 is 0.95. The MSE and cross-entropy losses are used for masked image and language modeling, respectively. In practice, we set λ in Eq. 3 to 1. A training-step sketch wiring these settings together follows the table. |
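The Experiment Setup row quotes concrete optimizer and loss settings, and a minimal PyTorch training-step sketch can tie them together. Only the AdamW hyperparameters, the MSE/cross-entropy loss pairing, the batch size of 256, and λ = 1 come from the paper; the `ToyMRM` module, the tensor shapes, and the vocabulary size are placeholder assumptions, not MRM's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for MRM's encoder-decoder: two linear heads that
# reconstruct masked image patches and predict masked report tokens.
class ToyMRM(nn.Module):
    def __init__(self, vocab_size=30522, dim=768, patch_dim=16 * 16):
        super().__init__()
        self.image_head = nn.Linear(dim, patch_dim)   # masked image modeling head
        self.text_head = nn.Linear(dim, vocab_size)   # masked language modeling head

    def forward(self, image_feats, text_feats):
        return self.image_head(image_feats), self.text_head(text_feats)

model = ToyMRM()

# Optimizer settings quoted from the paper's experiment setup.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1.5e-4, weight_decay=0.05, betas=(0.9, 0.95)
)

mim_loss = nn.MSELoss()           # masked image modeling (MSE, per the paper)
mlm_loss = nn.CrossEntropyLoss()  # masked language modeling (cross-entropy)
lam = 1.0                         # lambda in Eq. 3, set to 1 in practice

# One illustrative step with random stand-in features and targets
# (the real batch size is 256; shapes here are assumptions).
image_feats = torch.randn(4, 196, 768)
text_feats = torch.randn(4, 64, 768)
patch_targets = torch.randn(4, 196, 256)
token_targets = torch.randint(0, 30522, (4, 64))

patch_pred, token_logits = model(image_feats, text_feats)
loss = mim_loss(patch_pred, patch_targets) + lam * mlm_loss(
    token_logits.flatten(0, 1), token_targets.flatten()
)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```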
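Several of the quoted splits are percentage-based (e.g. 70%/10%/20%). A minimal sketch of such a random index split follows; the fixed seed and the absence of patient-level grouping are assumptions, since the quoted excerpts give only the ratios, not the splitting procedure.

```python
import numpy as np

def split_indices(n, train=0.7, val=0.1, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/val/test partitions."""
    rng = np.random.default_rng(seed)       # seed is an assumption for reproducibility
    idx = rng.permutation(n)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hypothetical dataset size; the remaining 20% becomes the test set.
train_idx, val_idx, test_idx = split_indices(10_000)
print(len(train_idx), len(val_idx), len(test_idx))  # 7000 1000 2000
```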
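The headline 88.5% mean AUC on CheXpert is a per-class AUC averaged over labels. The sketch below computes that metric with `sklearn.metrics.roc_auc_score`; the random predictions, and the choice of 5 labels for a 234-image test set, are stand-in assumptions for illustration (the 234-image test size is from the quoted split).

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_classes = 234, 5  # CheXpert test size per the quote; 5 labels assumed

# Random stand-ins for ground-truth multi-label annotations and model scores.
labels = rng.integers(0, 2, size=(n_samples, n_classes))
probs = rng.random(size=(n_samples, n_classes))

# AUC per class, then the unweighted mean over classes.
per_class_auc = [roc_auc_score(labels[:, c], probs[:, c]) for c in range(n_classes)]
mean_auc = float(np.mean(per_class_auc))
print(f"mean AUC: {mean_auc:.3f}")
```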