Clinical-BERT: Vision-Language Pre-training for Radiograph Diagnosis and Reports Generation

Authors: Bin Yan, Mingtao Pei

AAAI 2022, pp. 2982-2990

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the pre-training model on Radiograph Diagnosis and Reports Generation tasks across four challenging datasets: MIMIC-CXR, IU X-Ray, COV-CTR, and NIH, and achieve state-of-the-art results for all the tasks, which demonstrates the effectiveness of our pre-training model.
Researcher Affiliation | Academia | Bin Yan, Mingtao Pei*, Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, {bean.yan, peimt}@bit.edu.cn
Pseudocode | No | The paper describes its algorithms and models in prose and with mathematical formulas, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | We pre-train the Clinical-BERT on MIMIC-CXR (Johnson et al. 2019) dataset... The radiograph reports generation task is conducted on IU X-Ray (Demner-Fushman et al. 2016) and COV-CTR (Li et al. 2020b)... The radiograph diagnosis task is conducted on NIH (Wang et al. 2017).
Dataset Splits | Yes | We pre-train the Clinical-BERT on MIMIC-CXR (Johnson et al. 2019) dataset... For a fair comparison, we use the official splitting for training, validation, and testing... We randomly split both datasets into training, validation, and testing in the ratio of 7:1:2. The radiograph diagnosis task is conducted on NIH (Wang et al. 2017) dataset... The official splitting set is adopted in the experiment. (A sketch of the 7:1:2 random split appears after the table.)
Hardware Specification | Yes | All experiments are run on two Nvidia 3090 GPUs.
Software Dependencies | No | The paper mentions software components like BERT-base, DenseNet-121, AdamW, and Jieba, but does not provide specific version numbers for these or any other software libraries or frameworks used.
Experiment Setup | Yes | The AdamW (Loshchilov and Hutter 2019) optimizer is adopted with a weight decay of 0.01. Batch size is set as 256 with gradient accumulation (every 4 steps). The learning rates for the backbone and the visual extractor are 1e-4 and 5e-5, respectively. We pre-train the model for 50 epochs. (A training-loop sketch with these hyperparameters follows the table.)
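
The Dataset Splits row reports a random 7:1:2 train/validation/test split for IU X-Ray and COV-CTR. A minimal Python sketch of such a split follows; the function name, seed, and the decision to split a flat list of samples are our own illustration, since the authors do not release code.

```python
import random

def split_7_1_2(samples, seed=0):
    """Randomly split samples into train/val/test at a 7:1:2 ratio.

    Illustrative only: the paper reports the ratio but not the seed
    or the splitting code.
    """
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]  # remainder, roughly 20%
    return train, val, test
```

Whether the authors split at the image, report, or patient level is not stated in the quoted excerpt, so the sketch operates on generic samples.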
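
The Experiment Setup row pins down the optimizer hyperparameters. The PyTorch sketch below is a minimal wiring of those numbers, assuming tiny placeholder linear layers in place of the BERT-base backbone and DenseNet-121 visual extractor and a synthetic batch in place of the real data loader; only the hyperparameters (AdamW, weight decay 0.01, learning rates 1e-4 and 5e-5, accumulation every 4 steps, 50 epochs) come from the paper.

```python
import torch
from torch import nn

# Placeholder modules: the paper pairs a BERT-base backbone with a
# DenseNet-121 visual extractor; tiny linear layers stand in here so
# the sketch runs end to end.
backbone = nn.Linear(16, 16)
visual_extractor = nn.Linear(16, 16)

# Per-module learning rates and weight decay as reported in the paper.
optimizer = torch.optim.AdamW(
    [
        {"params": backbone.parameters(), "lr": 1e-4},
        {"params": visual_extractor.parameters(), "lr": 5e-5},
    ],
    weight_decay=0.01,
)

ACCUM_STEPS = 4  # the paper accumulates gradients every 4 steps

for epoch in range(50):  # 50 pre-training epochs
    for step in range(8):  # stand-in for iterating over a data loader
        x = torch.randn(4, 16)  # synthetic batch; real inputs are image/report pairs
        # Dummy objective; the loss is scaled so the accumulated
        # gradients match one large-batch update.
        loss = backbone(visual_extractor(x)).pow(2).mean() / ACCUM_STEPS
        loss.backward()
        if (step + 1) % ACCUM_STEPS == 0:
            optimizer.step()
            optimizer.zero_grad()
```

Read literally, a batch size of 256 reached via accumulation every 4 steps implies a per-step batch of 64, though the quoted phrasing leaves that detail ambiguous.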