Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Authors: Xinlu Zhang, Shiyang Li, Xianjun Yang, Chenxin Tian, Yao Qin, Linda Ruth Petzold
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method significantly enhances performance in both few-shot and full training settings across three medical knowledge-intensive tasks, achieving up to a 22.57% increase in absolute accuracy compared to SLM fine-tuning without context, and sets new state-of-the-art results in two medical tasks within privacy-restricted scenarios. Further out-of-domain testing and experiments in two general domain datasets showcase its generalizability and broad applicability. |
| Researcher Affiliation | Academia | University of California, Santa Barbara; Chinese Academy of Medical Sciences and Peking Union Medical College |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found; methods are described in prose and diagrams. |
| Open Source Code | Yes | Our code can be found at https://github.com/XZhang97666/PrivacyBoost-SLM. Our codes and generated data are public at https://github.com/XZhang97666/PrivacyBoost-SLM. |
| Open Datasets | Yes | MedQA (Jin et al., 2020) contains 4-way multiple-choice questions from the US Medical Licensing Exam. It has 10,178/1,272/1,273 instances in the training/development/test sets. Results on the development and test sets are reported. |
| Dataset Splits | Yes | MedQA (Jin et al., 2020) contains 4-way multiple-choice questions from the US Medical Licensing Exam. It has 10,178/1,272/1,273 instances in the training/development/test sets. (A split-size sanity check is sketched below the table.) |
| Hardware Specification | Yes | We implement both SFT and FTC based on huggingface transformers Wolf et al. (2020), and train on NVIDIA A40-48GB GPUs. |
| Software Dependencies | No | The paper mentions software such as Hugging Face Transformers and the AdamW optimizer but does not pin version numbers, so the software environment cannot be reproduced exactly. |
| Experiment Setup | Yes | For all datasets, we utilize AdamW (Loshchilov and Hutter, 2019) as the optimizer. For MedQA and HEADQA, we set learning rates of 5 × 10⁻⁵, 5 × 10⁻⁵, and 2 × 10⁻⁶ for BioLinkBERT-Base, BioLinkBERT-Large, and BioMedLM in both FTC and SFT settings. For MedMCQA, we set learning rates of 2 × 10⁻⁵, 2 × 10⁻⁵, and 2 × 10⁻⁶ for BioLinkBERT-Base, BioLinkBERT-Large, and BioMedLM in both FTC and SFT settings. For BioLinkBERT-Base and BioLinkBERT-Large, we limit training to 100 epochs with a 200-step warm-up and apply early stopping after 5 epochs without validation improvement. Batch sizes are 8 for few-shot and full-training scenarios across all datasets. For BioMedLM, we set the training epochs to 10 for all datasets. |
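
The split sizes quoted in the Open Datasets and Dataset Splits rows (10,178/1,272/1,273 for MedQA) can be sanity-checked before training. The following is a minimal sketch using the Hugging Face `datasets` library; the `bigbio/med_qa` dataset ID and config name are assumptions for illustration and are not taken from the paper, whose processed data ships with the linked PrivacyBoost-SLM repository.

```python
# Sanity-check the MedQA split sizes reported in the paper (10,178 / 1,272 / 1,273).
# NOTE: the dataset ID and config name below are assumptions for illustration;
# the authors release their own processed data via the PrivacyBoost-SLM repository.
from datasets import load_dataset

medqa = load_dataset("bigbio/med_qa", name="med_qa_en_source")

expected = {"train": 10_178, "validation": 1_272, "test": 1_273}
for split, size in expected.items():
    actual = len(medqa[split])
    status = "OK" if actual == size else "MISMATCH"
    print(f"{split}: {actual} instances (expected {size}) -> {status}")
```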
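
The hyperparameters in the Experiment Setup row map onto a standard Hugging Face `Trainer` configuration (AdamW is the `Trainer` default optimizer). Below is a minimal sketch of the reported MedQA settings for BioLinkBERT-Base; the checkpoint name, output directory, dataset objects, and metric function are placeholders, and exact library versions are not pinned in the paper.

```python
# Minimal sketch of the reported fine-tuning setup (MedQA, BioLinkBERT-Base).
# Placeholders: checkpoint name, output_dir, train/dev datasets, compute_accuracy.
from transformers import (
    AutoModelForMultipleChoice,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

model = AutoModelForMultipleChoice.from_pretrained("michiyasunaga/BioLinkBERT-base")

args = TrainingArguments(
    output_dir="ftc-medqa-biolinkbert-base",   # placeholder
    learning_rate=5e-5,                        # 2e-5 for MedMCQA; 2e-6 for BioMedLM
    per_device_train_batch_size=8,             # batch size 8 in few-shot and full training
    num_train_epochs=100,                      # 10 for BioMedLM
    warmup_steps=200,                          # 200-step warm-up
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="accuracy",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,       # placeholder: tokenized MedQA (+ generated context for FTC)
    eval_dataset=dev_dataset,          # placeholder: development split
    compute_metrics=compute_accuracy,  # placeholder metric returning {"accuracy": ...}
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],  # stop after 5 epochs w/o dev gain
)
trainer.train()
```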