LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation
Authors: Suhyeon Lee, Won Jun Kim, Jinho Chang, Jong Chul Ye
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare the performance of LLM-CXR against similar contemporary models across all tasks performed by LLM-CXR. For CXR-to-report generation, we compare our results with UniXGen (Lee et al., 2023), XrayGPT (Thawkar et al., 2023), RadFM (Wu et al., 2023a), IFCC (Delbrouck et al., 2022), and R2Gen (Chen et al., 2020); for CXR-VQA, with XrayGPT (Thawkar et al., 2023), RadFM (Wu et al., 2023a), and ELIXR (Xu et al., 2023a); and for text-to-CXR generation, with UniXGen (Lee et al., 2023) and RoentGen (Chambon et al., 2022). |
| Researcher Affiliation | Academia | Suhyeon Lee, Won Jun Kim, Jinho Chang & Jong Chul Ye, Korea Advanced Institute of Science & Technology, {suhyeon.lee, wonjun, jinhojsk515, jong.ye}@kaist.ac.kr |
| Pseudocode | No | The paper describes the methodology in prose and uses mathematical equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release all code and model checkpoints upon publication along with step-by-step guidance to reproduce the methods explained in Section 2 so that anyone can reproduce our results. https://github.com/hyn2028/llm-cxr |
| Open Datasets | Yes | We used MIMIC-CXR v2.0.0 (Johnson et al., 2019a) as our dataset of CXR-report pairs. The data set consists of 377,110 CXRs from 227,835 radiology studies. |
| Dataset Splits | No | The train-test split used the standard split of MIMIC-CXR-JPG (Johnson et al., 2019b). The test set sizes before and after pruning are 368,960 and 70,403, respectively. (A sketch of applying this standard split appears as the first example after the table.) |
| Hardware Specification | Yes | Training took about 14.5 hours for stage 1 and about 1.5 hours for stage 2 using 8× NVIDIA A100 40GB GPUs. |
| Software Dependencies | No | The paper mentions the `TorchXRayVision` library and the `LLaMA 2 (Llama2-13b-chat-hf)` model, but does not provide specific version numbers for general software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Model training was performed for 590k steps with the Adam optimizer (Kingma & Ba, 2014), with a batch size of 2 and a learning rate of 4.5e-6. (A minimal sketch of this configuration appears as the second example after the table.) |
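
The standard split referenced in the Dataset Splits row is distributed with MIMIC-CXR-JPG as a per-DICOM CSV. Below is a minimal sketch, assuming the PhysioNet file name `mimic-cxr-2.0.0-split.csv.gz` and its `dicom_id`/`split` columns; the pruning step described in the paper is not reproduced here.

```python
import pandas as pd

# mimic-cxr-2.0.0-split.csv.gz maps each DICOM image to train/validate/test.
# File and column names assume the PhysioNet MIMIC-CXR-JPG release layout.
split_df = pd.read_csv("mimic-cxr-2.0.0-split.csv.gz")

train_ids = split_df.loc[split_df["split"] == "train", "dicom_id"].tolist()
val_ids = split_df.loc[split_df["split"] == "validate", "dicom_id"].tolist()
test_ids = split_df.loc[split_df["split"] == "test", "dicom_id"].tolist()

print(f"train: {len(train_ids)}, validate: {len(val_ids)}, test: {len(test_ids)}")
```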
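
The Experiment Setup row reports the optimizer, batch size, learning rate, and step count but not the surrounding training loop. The following is a minimal PyTorch sketch of that configuration; the model, data, and loss are placeholders, not the released LLM-CXR code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; only the optimization settings below
# (Adam, lr=4.5e-6, batch size 2, 590k steps) come from the paper.
model = torch.nn.Linear(16, 16)
loader = DataLoader(TensorDataset(torch.randn(64, 16)), batch_size=2, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=4.5e-6)

max_steps, step = 590_000, 0
while step < max_steps:
    for (x,) in loader:
        loss = model(x).pow(2).mean()  # placeholder loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step >= max_steps:
            break
```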