Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
Authors: Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities. The experimental outcomes highlight the presence of community bias in cross-lingual VLP. Reducing this bias enhances the performance not only in vision-language tasks but also in uni-modal visual tasks. The source code has been released at https://github.com/SUSTechBruce/Med-UniC. |
| Researcher Affiliation | Academia | The Ohio State University; Imperial College London; Peking University; The Chinese University of Hong Kong, Shenzhen; The Hong Kong University of Science and Technology |
| Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or structured in the paper. |
| Open Source Code | Yes | The source code has been released at https://github.com/SUSTechBruce/Med-UniC. |
| Open Datasets | Yes | We pre-train the Med-UniC framework using MIMIC-CXR [38], which contains CXR images and their corresponding radiology reports in English. Also, we involve PadChest [39], which includes CXR images and their corresponding radiology reports collected in the Valencia region, Spain. |
| Dataset Splits | Yes | For all downstream tasks, except zero-shot classification, we fine-tune with 1%, 10%, and 100% of the training data. More downstream task settings, including split information and train/valid/test set details, can be found in the Appendix. |
| Hardware Specification | Yes | Med-UniC is trained over 50 epochs using an early-stop strategy on 16 V100 GPUs with a batch size of 128 per GPU. |
| Software Dependencies | No | The paper mentions spaCy 3 and a TF-IDF tool (scikit-learn) but does not provide exact version numbers for these software dependencies; only 'spaCy 3' is stated, not a full version such as '3.X.X'. |
| Experiment Setup | Yes | Med-UniC is trained over 50 epochs using an early-stop strategy on 16 V100 GPUs with a batch size of 128 per GPU. We utilize AdamW [42] as the optimizer, setting the learning rate to 4e-5 and the weight decay to 5e-2. A linear warm-up and cosine annealing scheduler are also deployed in this process. Additionally, the coefficient λ is set to 5.1e-3 following [36]. (A sketch of this optimization setup follows the table.) |
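The experiment-setup row above fully specifies the optimizer but not the scheduler internals. Below is a minimal PyTorch sketch of that configuration (AdamW with lr 4e-5, weight decay 5e-2, linear warm-up followed by cosine annealing). The `model` stand-in and the warm-up length `WARMUP_EPOCHS` are assumptions not given in the excerpt; the authors' actual implementation is in the released repository (https://github.com/SUSTechBruce/Med-UniC).

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR, SequentialLR

# Hypothetical stand-in; the real Med-UniC vision/text encoders are in the repo.
model = torch.nn.Linear(768, 768)

EPOCHS = 50        # paper: trained over 50 epochs with an early-stop strategy
WARMUP_EPOCHS = 5  # assumption: warm-up length is not stated in the excerpt

# AdamW with the reported learning rate (4e-5) and weight decay (5e-2)
optimizer = AdamW(model.parameters(), lr=4e-5, weight_decay=5e-2)

# Linear warm-up, then cosine annealing, as described in the paper
scheduler = SequentialLR(
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=0.01, total_iters=WARMUP_EPOCHS),
        CosineAnnealingLR(optimizer, T_max=EPOCHS - WARMUP_EPOCHS),
    ],
    milestones=[WARMUP_EPOCHS],
)

for epoch in range(EPOCHS):
    # ... forward/backward passes over the batches would go here ...
    optimizer.step()   # placeholder; real code steps once per batch
    scheduler.step()   # advance the warm-up/cosine schedule once per epoch
```

This uses `SequentialLR` to chain the two schedules at the `WARMUP_EPOCHS` milestone; the paper does not say whether scheduling is per-step or per-epoch, so the per-epoch `scheduler.step()` here is one plausible choice.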