FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Authors: Jingwei Sun, Ziyue Xu, Hongxu Yin, Dong Yang, Daguang Xu, Yudong Liu, Zhixu Du, Yiran Chen, Holger R. Roth
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on multiple datasets using SOTA PLMs. The results show that FedBPT reduces the communication cost by a factor of more than 500k while achieving comparable results with the baselines that require model parameter access and back-propagation for optimization. |
| Researcher Affiliation | Collaboration | ¹Department of Electrical and Computer Engineering, Duke University, Durham, USA; ²NVIDIA, Santa Clara, USA. |
| Pseudocode | Yes | Algorithm 1 Training Algorithm of FedBPT. Algorithm 2 CMA-ES update. |
| Open Source Code | Yes | Our code is available in NVIDIA FLARE. |
| Open Datasets | Yes | We conduct experiments on three language understanding datasets: (1) The SST-2 (Socher et al., 2013) is a popular sentiment analysis dataset. (2) The Yelp polarity (yelp) is another sentiment analysis dataset... (3) The AG's News dataset (OpenAI) is a large-scale topic classification dataset... |
| Dataset Splits | No | The paper describes constructing a training set and splitting data for IID/non-IID settings but does not explicitly provide details for a validation dataset split or how it was used. |
| Hardware Specification | No | The paper discusses the target deployment environment, mentioning 'edge devices,' 'mobile phones,' and 'AR headsets,' and refers to 'devices with limited resources.' However, it does not provide specific details (e.g., GPU/CPU models, memory) about the hardware used to conduct the reported experiments. |
| Software Dependencies | No | The paper mentions general software frameworks like 'TensorFlow Lite, PyTorch Mobile, and Apple Core ML' and 'SOTA PLMs' (RoBERTa, Llama 2), but it does not specify any software dependencies with version numbers required to replicate the experiments. |
| Experiment Setup | Yes | FL setup & Hyperparameters: We follow FedPrompt (Zhao et al., 2023) to design our FL setup. The system has ten clients, and all of the clients participate in training in each round. Considering the real world, where many users possess only a limited amount of labeled data, we conduct experiments under few-shot settings. We randomly select 40 samples for each class to construct a training set Dtrain. We conduct experiments in both IID and non-IID settings. More detailed hyperparameters can be found in Appendix B. [...] The initial search step length σ1 is 1. We set the local iteration I to 8 and the local population λk to 5 for all clients. |
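
The pseudocode and experiment-setup rows above describe the core loop: each client tunes a low-dimensional prompt vector with gradient-free CMA-ES against a frozen PLM (Algorithms 1 and 2), using ten clients, 8 local iterations, a local population of 5, and an initial step length of 1. The sketch below illustrates that loop under stated assumptions: it uses the `cma` Python package for the local CMA-ES updates, a toy placeholder `query_model_loss` standing in for inference-only queries to the frozen PLM, an assumed 500-dimensional prompt vector, and a server that simply averages the clients' returned distribution means rather than reproducing the paper's exact aggregation of the CMA-ES state.

```python
import numpy as np
import cma  # pip install cma; provides the CMA-ES optimizer used for black-box prompt tuning

# Hyperparameters reported in the paper's FL setup (values quoted in the table above).
NUM_CLIENTS = 10     # all ten clients participate in every round
LOCAL_ITERS = 8      # local CMA-ES iterations I
LOCAL_POPSIZE = 5    # local population size lambda_k
SIGMA_INIT = 1.0     # initial search step length sigma_1
PROMPT_DIM = 500     # assumed intrinsic dimension of the tunable prompt vector

def query_model_loss(prompt_vector, local_data):
    """Placeholder black-box objective. In a FedBPT-style system this would project
    the prompt vector into the frozen PLM's embedding space and return the loss on
    the client's few-shot data via inference-only queries (no gradients)."""
    # Toy stand-in so the sketch runs end to end; replace with real PLM queries.
    return float(np.sum(np.asarray(prompt_vector) ** 2))

def local_black_box_update(global_mean, sigma, local_data):
    """One client's local step: a few CMA-ES generations started from the current
    global prompt, using only loss queries (no model access, no back-propagation)."""
    es = cma.CMAEvolutionStrategy(global_mean, sigma,
                                  {"popsize": LOCAL_POPSIZE, "verbose": -9})
    for _ in range(LOCAL_ITERS):
        candidates = es.ask()                       # sample lambda_k candidate prompts
        losses = [query_model_loss(c, local_data) for c in candidates]
        es.tell(candidates, losses)                 # CMA-ES distribution update
    return np.asarray(es.result.xfavorite)          # client's updated distribution mean

def federated_round(global_prompt, client_datasets):
    """One communication round: each client uploads only a PROMPT_DIM-sized vector;
    the server averages them (a simplification of the paper's aggregation rule)."""
    client_means = [local_black_box_update(global_prompt, SIGMA_INIT, data)
                    for data in client_datasets]
    return np.mean(client_means, axis=0)

if __name__ == "__main__":
    global_prompt = np.zeros(PROMPT_DIM)
    client_datasets = [None] * NUM_CLIENTS   # placeholders for 40-shot-per-class local sets
    for _ in range(3):                       # a few rounds for illustration
        global_prompt = federated_round(global_prompt, client_datasets)
    print("final prompt norm:", np.linalg.norm(global_prompt))
```

Replacing `query_model_loss` with real inference-API calls and per-client few-shot data is the deployment-specific part; the hyperparameters reported above map directly onto `LOCAL_ITERS`, `LOCAL_POPSIZE`, and `SIGMA_INIT`.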