Detecting Any instruction-to-answer interaction relationship: Universal Instruction-to-Answer Navigator for Med-VQA
Authors: Zhongze Wu, Hongyan Xu, Yitian Long, Shan You, Xiu Su, Jun Long, Yueyi Luo, Chang Xu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiment Results |
| Researcher Affiliation | Collaboration | ¹Central South University, Changsha, Hunan, China; ²University of New South Wales, Sydney, Australia; ³Vanderbilt University, Nashville, Tennessee, USA; ⁴SenseTime; ⁵University of Sydney, Sydney, Australia. |
| Pseudocode | Yes | Algorithm 1 Token-Level Cut-Mix (TC-Mix) |
| Open Source Code | No | The paper states 'we have made the IAI-Med VQA dataset publicly available', but does not provide concrete access to the source code for the Uni-Med framework itself. |
| Open Datasets | Yes | We use the PMC-VQA dataset (Zhang et al., 2023b), which includes 227K VQA pairs from 149K images... For fine-tuning, we used two medical datasets: VQA-RAD (Nguyen et al., 2019a)... and SLAKE (Liu et al., 2021b)... |
| Dataset Splits | No | For fine-tuning, we used two medical datasets: VQA-RAD (Nguyen et al., 2019a), consisting of 314 radiology images and 3,064 clinician-curated question-and-answer pairs; and SLAKE (Liu et al., 2021b), which offers 642 radiology images and 14K question-and-answer samples, of which we used 70% for training and 30% for testing. |
| Hardware Specification | Yes | The model is trained using the AdamW optimizer, combined with a cosine learning rate scheduler, across 8 Tesla V100 GPUs over 8,000 steps. |
| Software Dependencies | No | The paper mentions 'AdamW optimizer' and 'cosine learning rate scheduler' but does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | We set a global batch size of 128 and a peak learning rate of 2e-5 to optimize performance. |
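
The training hyperparameters quoted in the table (8,000 steps, peak learning rate 2e-5, global batch size 128, 8 GPUs, cosine schedule) can be turned into a rough numeric sketch. The table does not state a warmup length, a minimum learning rate, or whether gradient accumulation was used, so the values below for `WARMUP_STEPS`, `MIN_LR`, and `grad_accum_steps` are assumptions for illustration only, and the function names are hypothetical rather than taken from the paper.

```python
import math

# Values quoted in the reproducibility table.
PEAK_LR = 2e-5
TOTAL_STEPS = 8_000
GLOBAL_BATCH_SIZE = 128
NUM_GPUS = 8

# Assumptions (not stated in the table): no warmup, decay to zero.
WARMUP_STEPS = 0
MIN_LR = 0.0


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step under a cosine schedule."""
    if step < WARMUP_STEPS:
        # Linear warmup; unused while WARMUP_STEPS is 0.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(1.0, progress)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))


def per_gpu_batch_size(grad_accum_steps: int = 1) -> int:
    """Per-device micro-batch size implied by the global batch size.

    grad_accum_steps is an assumed knob; the table does not mention it.
    """
    per_gpu, remainder = divmod(GLOBAL_BATCH_SIZE, NUM_GPUS * grad_accum_steps)
    if remainder:
        raise ValueError("global batch size is not evenly divisible")
    return per_gpu


if __name__ == "__main__":
    for step in (0, 2_000, 4_000, 6_000, 8_000):
        print(step, cosine_lr(step))
    print("per-GPU batch size:", per_gpu_batch_size())
```

Under these assumptions the schedule starts at the peak rate, passes through roughly half of it at the midpoint, and reaches zero at the final step; without gradient accumulation, each of the 8 GPUs would process 16 samples per step.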