PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications

Authors: Dingkang Yang, Jinjie Wei, Dongling Xiao, Shunli Wang, Tong Wu, Gang Li, Mingcheng Li, Shuaibing Wang, Jiawei Chen, Yue Jiang, Qingyao Xu, Ke Li, Peng Zhai, Lihua Zhang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results based on the metrics, GPT-4, and doctor evaluations on distinct downstream tasks show that PediatricsGPT consistently outperforms previous Chinese medical LLMs. |
| Researcher Affiliation | Collaboration | (1) Academy for Engineering and Technology, Fudan University, Shanghai, China; (2) Tencent Youtu Lab, Shanghai, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The project and data will be released at https://github.com/ydk122024/PediatricsGPT. |
| Open Datasets | Yes | Motivated by these observations, we construct PedCorpus, a high-quality dataset with over 300,000 instructions across single-turn and multi-turn medical conversations. Besides containing generalist healthcare data, PedCorpus incorporates multi-dimensional corpora from pediatric textbooks, guidelines, and knowledge graphs to ensure medical knowledge's accuracy. (A hypothetical record-format sketch follows this table.) |
| Dataset Splits | Yes | We specify eval_steps at 100 and save the best-performing weights on the validation set to ensure optimal results. (A configuration sketch follows this table.) |
| Hardware Specification | Yes | The model training is accomplished through the PyTorch platform with Accelerate and DeepSpeed packages using eight Nvidia A800 GPUs. (A launch sketch follows this table.) |
| Software Dependencies | No | The paper mentions using the 'PyTorch platform with Accelerate and DeepSpeed packages' but does not specify version numbers for any of these software components. |
| Experiment Setup | Yes | More detailed hyper-parameter configurations for different stages are shown in Appendix C. |
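On the Open Datasets row: the paper excerpt describes PedCorpus as mixing single-turn and multi-turn medical conversations but does not reproduce its exact record schema. The sketch below is purely hypothetical, following record layouts commonly used for Chinese medical instruction-tuning corpora; every field name and string is an illustrative placeholder, not PedCorpus content.

```python
# Hypothetical record layouts (NOT PedCorpus's published schema) for
# single-turn and multi-turn instruction data, shown as Python dicts.

# A single-turn record: one instruction, one reference answer.
single_turn = {
    "instruction": "Caregiver question about a pediatric symptom (placeholder).",
    "input": "",  # optional context, empty for plain Q&A
    "output": "Answer grounded in pediatric textbooks/guidelines (placeholder).",
}

# A multi-turn record: an alternating user/assistant conversation.
multi_turn = {
    "conversations": [
        {"role": "user", "content": "Initial question (placeholder)."},
        {"role": "assistant", "content": "First answer (placeholder)."},
        {"role": "user", "content": "Follow-up question (placeholder)."},
        {"role": "assistant", "content": "Follow-up answer (placeholder)."},
    ],
}
```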
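On the Dataset Splits row: the quoted sentence (eval_steps of 100, best weights kept on the validation set) maps naturally onto a Hugging Face Trainer configuration. Below is a minimal sketch under that assumption; the output directory and selection metric are placeholders, since the paper does not state them.

```python
# Minimal sketch (assumed Trainer-style setup, not the authors' released
# script) of evaluating every 100 steps and keeping the best validation
# checkpoint, as described in the Dataset Splits row.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./pediatricsgpt-sft",   # hypothetical output path
    eval_strategy="steps",              # `evaluation_strategy` in older transformers
    eval_steps=100,                     # evaluate on the validation split every 100 steps
    save_strategy="steps",
    save_steps=100,                     # checkpoint cadence must match eval cadence
    load_best_model_at_end=True,        # restore the best validation checkpoint
    metric_for_best_model="eval_loss",  # assumed selection metric
    greater_is_better=False,            # lower validation loss is better
)
```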
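On the Hardware Specification row: a minimal sketch of a PyTorch training step wrapped with Accelerate, which dispatches to DeepSpeed across the eight GPUs when launched with, e.g., `accelerate launch --num_processes 8 train.py`. The tiny linear model and random tensors are stand-ins for the actual LLM and PedCorpus data; this is an assumed setup, not the paper's code.

```python
# Minimal sketch (assumed setup): PyTorch + Accelerate training step that
# DeepSpeed can shard across GPUs when launched via `accelerate launch`.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # picks up the DeepSpeed/multi-GPU launch config

model = nn.Linear(128, 2)    # placeholder for the actual LLM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
data = TensorDataset(torch.randn(64, 128), torch.randint(0, 2, (64,)))
loader = DataLoader(data, batch_size=8)

# Accelerate wraps model/optimizer/dataloader for distributed execution.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

loss_fn = nn.CrossEntropyLoss()
for inputs, labels in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    accelerator.backward(loss)  # replaces loss.backward() for mixed precision/sharding
    optimizer.step()
```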