DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting

Authors: Binqian Xu, Xiangbo Shu, Haiyang Mei, Zechen Bai, Basura Fernando, Mike Zheng Shou, Jinhui Tang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on diverse datasets consistently demonstrate that DoFIT excels in cross-domain collaborative training and exhibits significant advantages over conventional FIT methods in alleviating catastrophic forgetting.
Researcher Affiliation | Academia | Binqian Xu (1), Xiangbo Shu (1,*), Haiyang Mei (2), Zechen Bai (2), Basura Fernando (3), Mike Zheng Shou (2), and Jinhui Tang (1); (1) Nanjing University of Science and Technology, (2) Show Lab, National University of Singapore, (3) Institute of High-Performance Computing, A*STAR.
Pseudocode | Yes | A.1 Algorithm; Algorithm 1: The training process of DoFIT for two domains.
Open Source Code | Yes | Code is available at https://github.com/1xbq1/DoFIT.
Open Datasets | Yes | We train our DoFIT on three datasets, i.e., FinGPT [36], Alpaca-GPT4 [23], and MedAlpaca [2] from the Finance (F), General (G), and Medical (M) domains, respectively. ... FPB [19], FiQA-SA [18], TFNS [17], and NWGI [33] are all the evaluation datasets on the Finance domain. ... MedQA [10], and MedMCQA [22] are the evaluation datasets on the M domain.
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits for its datasets, either as percentages or as sample counts.
Hardware Specification | Yes | In all experiments conducted on one NVIDIA A40, the frozen LLM used is Llama2-7B with 32 layers [27] quantized to int8.
Software Dependencies | No | The paper mentions Llama2-7B, LoRA, and the AdamW optimizer, but does not provide version numbers for these components, nor for the programming language or libraries such as Python or PyTorch.
Experiment Setup | Yes | In all experiments conducted on one NVIDIA A40, the frozen LLM used is Llama2-7B with 32 layers [27] quantized to int8. The LoRA rank and alpha are set to 32 and 64, respectively. The maximum sequence length is 512. Following the formatting instructions of the Alpaca template [25], the training runs for 200 rounds, with a cosine learning rate scheduler adjusting the learning rate from 5e-5 to 1e-6. In each round, the selected clients are trained for 10 steps with the AdamW [16] optimizer. The batch size is set to 16. For FinGPT/Alpaca-GPT4/MedAlpaca training, there are 10k/20k/20k samples in total across 50/20/20 clients, with 5/2/2 clients selected randomly per round.
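As a quick reference, the hyperparameters reported in the Experiment Setup row can be collected into a minimal configuration sketch. This assumes a Hugging Face transformers/peft/bitsandbytes stack on PyTorch, which the paper does not pin to specific versions; the constant names and the schedule horizon (200 rounds x 10 local steps) are illustrative assumptions, not taken from the DoFIT codebase.

```python
# Minimal configuration sketch of the reported setup (assumptions noted inline);
# it mirrors the hyperparameters above but is not the official DoFIT implementation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

BASE_MODEL = "meta-llama/Llama-2-7b-hf"   # frozen 32-layer Llama2-7B backbone

# int8-quantized backbone, small enough for a single NVIDIA A40
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# LoRA adapter with the reported rank/alpha; target modules left to peft defaults
model = get_peft_model(model, LoraConfig(r=32, lora_alpha=64, task_type="CAUSAL_LM"))

# AdamW with a cosine schedule decaying 5e-5 -> 1e-6; stepping it once per local
# optimizer step over 200 rounds x 10 steps is an assumption about the schedule
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=200 * 10, eta_min=1e-6
)

MAX_SEQ_LEN = 512        # maximum sequence length
BATCH_SIZE = 16          # batch size
ROUNDS = 200             # communication rounds
LOCAL_STEPS = 10         # optimizer steps per selected client per round
TOTAL_CLIENTS = {"FinGPT": 50, "Alpaca-GPT4": 20, "MedAlpaca": 20}
CLIENTS_PER_ROUND = {"FinGPT": 5, "Alpaca-GPT4": 2, "MedAlpaca": 2}
```

The local training loop, the Alpaca-template formatting, and DoFIT's domain-aware aggregation (Algorithm 1 in the paper) are omitted here; the authors' implementation is available at https://github.com/1xbq1/DoFIT.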