MEDICAL IMAGE UNDERSTANDING WITH PRETRAINED VISION LANGUAGE MODELS: A COMPREHENSIVE STUDY
Authors: Ziyuan Qin, Huahui Yi, Qicheng Lao, Kang Li
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on thirteen different medical datasets across various modalities, showing that our well-designed prompts greatly improve the zero-shot performance compared to the default prompts, and our fine-tuned models surpass the supervised models by a significant margin. |
| Researcher Affiliation | Academia | Ziyuan Qin1 Huahui Yi1 Qicheng Lao2,4 Kang Li1,3,4 1West China Biomedical Big Data Center, West China Hospital, Sichuan University 2School of Artificial Intelligence, BUPT 3Sichuan University Pittsburgh Institute 4Shanghai AI-Lab |
| Pseudocode | No | The paper does not contain any structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and more information could be found at https://github.com/MembrLab/MIU-VL |
| Open Datasets | Yes | For non-radiology images... The ISIC-16 dataset consists of 1,279 images with 1,282 bboxes... divided into 720/180/379 images for training, validation, and testing. The DFUC2020 dataset... divided into 1,280/320/400 images for training, validation, and testing... The BCCD dataset... split into training, validation, and test sets with 765, 73, and 36 images, respectively. |
| Dataset Splits | Yes | The ISIC-16 dataset consists of 1,279 images with 1,282 bboxes... divided into 720/180/379 images for training, validation, and testing. The DFUC2020 dataset... divided into 1,280/320/400 images for training, validation, and testing... The BCCD dataset... split into training, validation, and test sets with 765, 73, and 36 images, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or processor types used for running the experiments. It only mentions using a 'visual backbone' and 'linguistic backbone'. |
| Software Dependencies | No | The paper mentions software components like "Pubmed Bert-base-uncased variant", "OFA-base variant", and "MMDetection framework". However, it does not provide specific version numbers for these components as required for reproducibility. |
| Experiment Setup | Yes | We train our models using Adam optimizer with base learning rate of 1 × 10−4 (1 × 10−5 for the BERT text encoder), and the weight decay is set to 0.05. We freeze the bottom two layers of the image encoder and decay the learning rate by 0.1 when the validation performance plateaus. |
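The training recipe quoted in the Experiment Setup row can be sketched in PyTorch. This is a minimal illustration, not the authors' code: the stand-in `image_encoder` and `text_encoder` modules, and the way "bottom two layers" is selected, are assumptions made for the sake of a runnable example.

```python
import torch
from torch import nn, optim

# Illustrative stand-ins for the paper's backbones (assumptions, not the
# authors' architecture): a 4-layer image encoder and a BERT-like text encoder.
image_encoder = nn.Sequential(*[nn.Linear(8, 8) for _ in range(4)])
text_encoder = nn.Linear(8, 8)

# Freeze the bottom two layers of the image encoder, as described in the paper.
for layer in list(image_encoder.children())[:2]:
    for p in layer.parameters():
        p.requires_grad = False

# Adam with a base learning rate of 1e-4, 1e-5 for the text encoder,
# and weight decay 0.05 (values quoted from the paper).
optimizer = optim.Adam(
    [
        {"params": [p for p in image_encoder.parameters() if p.requires_grad],
         "lr": 1e-4},
        {"params": text_encoder.parameters(), "lr": 1e-5},
    ],
    weight_decay=0.05,
)

# Decay the learning rate by a factor of 0.1 when validation performance
# plateaus; call scheduler.step(val_metric) once per validation epoch.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1)
```

In a training loop, `scheduler.step(val_loss)` would be called after each validation pass so the 0.1 decay triggers only when the monitored metric stops improving.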