Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias

Authors: Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities. The experimental outcomes highlight the presence of community bias in cross-lingual VLP. Reducing this bias enhances the performance not only in vision-language tasks but also in uni-modal visual tasks. The source code has been released at https://github.com/SUSTechBruce/Med-UniC.
Researcher Affiliation | Academia | The Ohio State University; Imperial College London; Peking University; The Chinese University of Hong Kong, Shenzhen; The Hong Kong University of Science and Technology
Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or structured in the paper.
Open Source Code | Yes | The source code has been released at https://github.com/SUSTechBruce/Med-UniC.
Open Datasets | Yes | We pre-train the Med-UniC framework using MIMIC-CXR [38], which contains CXR images and their corresponding radiology reports in English. We also use PadChest [39], which includes CXR images and their corresponding radiology reports collected in the Valencia region, Spain.
Dataset Splits | Yes | For all downstream tasks except zero-shot classification, we fine-tune with 1%, 10%, and 100% of the training data. Further downstream task settings, including split information and train/valid/test set details, can be found in the Appendix.
Hardware Specification | Yes | Med-UniC is trained over 50 epochs using an early-stopping strategy on 16 V100 GPUs with a batch size of 128 per GPU.
Software Dependencies | No | The paper mentions spaCy 3 and the TF-IDF tool from scikit-learn but does not provide specific version numbers for these dependencies; only 'spaCy 3' is given, not a full 'spaCy 3.X.X' release.
Experiment Setup | Yes | Med-UniC is trained over 50 epochs using an early-stopping strategy on 16 V100 GPUs with a batch size of 128 per GPU. We use AdamW [42] as the optimizer, with the learning rate set to 4e-5 and the weight decay to 5e-2. A linear warm-up and cosine annealing scheduler are also deployed during training. Additionally, the coefficient λ is set to 5.1e-3 following [36].
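
The quoted training configuration maps onto a standard PyTorch setup. Below is a minimal sketch of the optimizer and learning-rate schedule implied by those hyperparameters (AdamW, learning rate 4e-5, weight decay 5e-2, linear warm-up followed by cosine annealing). The placeholder model, warm-up length, and total step count are assumptions for illustration only and do not come from the paper.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR, CosineAnnealingLR, SequentialLR

# Placeholder model; the actual Med-UniC encoders are not reproduced here.
model = torch.nn.Linear(768, 768)

# Hyperparameters quoted in the paper: learning rate 4e-5, weight decay 5e-2.
optimizer = AdamW(model.parameters(), lr=4e-5, weight_decay=5e-2)

# Warm-up length and total step count are assumptions for illustration;
# the paper states 50 epochs with early stopping but not the step counts.
warmup_steps = 1_000
total_steps = 50_000

# Linear warm-up from 0 to the base learning rate, then cosine annealing.
warmup = LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, step / warmup_steps))
cosine = CosineAnnealingLR(optimizer, T_max=total_steps - warmup_steps)
scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[warmup_steps])

for step in range(total_steps):
    # ... forward pass, loss computation, and loss.backward() would go here ...
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```

This is only a sketch of the reported optimization settings; batch size (128 per GPU across 16 V100s), early stopping, and the λ = 5.1e-3 loss coefficient belong to the full training pipeline and are not modeled here.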