LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation

Authors: Suhyeon Lee, Won Jun Kim, Jinho Chang, Jong Chul Ye

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare the performance of LLM-CXR against similar contemporary models across all tasks performed by LLM-CXR. For CXR-to-report generation, we compare our results with UniXGen (Lee et al., 2023), XrayGPT (Thawkar et al., 2023), RadFM (Wu et al., 2023a), IFCC (Delbrouck et al., 2022), and R2Gen (Chen et al., 2020); for CXR-VQA, with XrayGPT (Thawkar et al., 2023), RadFM (Wu et al., 2023a), and ELIXR (Xu et al., 2023a); and for text-to-CXR generation, with UniXGen (Lee et al., 2023) and RoentGen (Chambon et al., 2022).
Researcher Affiliation | Academia | Suhyeon Lee, Won Jun Kim, Jinho Chang & Jong Chul Ye, Korea Advanced Institute of Science & Technology, {suhyeon.lee, wonjun, jinhojsk515, jong.ye}@kaist.ac.kr
Pseudocode | No | The paper describes the methodology in prose and uses mathematical equations, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We release all code and model checkpoints upon publication along with step-by-step guidance to reproduce the methods explained in Section 2 so that anyone can reproduce our results. https://github.com/hyn2028/llm-cxr
Open Datasets | Yes | We used MIMIC-CXR v2.0.0 (Johnson et al., 2019a) as our dataset of CXR-report pairs. The dataset consists of 377,110 CXRs from 227,835 radiology studies.
Dataset Splits | No | The train-test split used the standard split of MIMIC-CXR-JPG (Johnson et al., 2019b). The training set sizes before and after pruning are 368,960 and 70,403, respectively. (A sketch of loading the official split file follows the table.)
Hardware Specification | Yes | Training took about 14.5 hours for stage 1 and about 1.5 hours for stage 2 using 8 NVIDIA A100 40GB GPUs.
Software Dependencies | No | The paper mentions the TorchXRayVision library and the LLaMA-2 model (Llama-2-13b-chat-hf), but does not provide specific version numbers for general software dependencies such as Python, PyTorch, or CUDA. (A hedged loading sketch follows the table.)
Experiment Setup | Yes | Model training was performed for 590k steps with the Adam (Kingma & Ba, 2014) optimizer, with a batch size of 2 and a learning rate of 4.5e-6. (A minimal optimizer-setup sketch follows the table.)
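
To make the Dataset Splits row concrete, the sketch below shows one way to apply the official MIMIC-CXR-JPG per-image split. The file name mimic-cxr-2.0.0-split.csv and its column names follow the PhysioNet distribution, and the path is a placeholder; this is an illustrative assumption, not the authors' preprocessing code, and it does not reproduce their pruning step.

```python
import pandas as pd

# Official per-image split shipped with MIMIC-CXR-JPG
# (columns: dicom_id, study_id, subject_id, split).
# Adjust the path to your local PhysioNet download.
split_df = pd.read_csv("mimic-cxr-2.0.0-split.csv")

train_ids = split_df.loc[split_df["split"] == "train", "dicom_id"].tolist()
val_ids = split_df.loc[split_df["split"] == "validate", "dicom_id"].tolist()
test_ids = split_df.loc[split_df["split"] == "test", "dicom_id"].tolist()

print(f"train={len(train_ids)}, validate={len(val_ids)}, test={len(test_ids)}")
```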
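
For the Software Dependencies row, here is a minimal sketch of loading the two named components through their public APIs (Hugging Face transformers and torchxrayvision). The repo id meta-llama/Llama-2-13b-chat-hf and the densenet121-res224-all weights are assumptions made for illustration; the paper does not pin versions or checkpoints beyond the model names.

```python
import torchxrayvision as xrv
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base chat model named in the paper; the gated Hugging Face repo id is an
# assumption and requires accepting Meta's license to download.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-chat-hf")
llm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-chat-hf")

# TorchXRayVision classifier with one of the library's published pretrained
# weights; the paper does not state which weights it used.
cxr_classifier = xrv.models.DenseNet(weights="densenet121-res224-all")
```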
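
The Experiment Setup row maps directly onto a standard PyTorch training configuration. The sketch below only wires the reported hyperparameters (590k steps, Adam, batch size 2, learning rate 4.5e-6) around a placeholder model and dataset; it is not the authors' training loop.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; substitute the instruction-finetuned LLM and
# the tokenized CXR-report pairs from the repository.
model = torch.nn.Linear(16, 16)
dataset = TensorDataset(torch.randn(64, 16), torch.randn(64, 16))

# Hyperparameters as reported in the paper.
loader = DataLoader(dataset, batch_size=2, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=4.5e-6)
max_steps = 590_000

step = 0
while step < max_steps:
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        loss.backward()
        optimizer.step()
        step += 1
        if step >= max_steps:
            break
```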