FedBPT: Efficient Federated Black-box Prompt Tuning for Large Language Models
Authors: Jingwei Sun, Ziyue Xu, Hongxu Yin, Dong Yang, Daguang Xu, Yudong Liu, Zhixu Du, Yiran Chen, Holger R. Roth
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted experiments on multiple datasets using SOTA PLMs. The results show that FedBPT reduces the communication cost by a factor of more than 500k while achieving comparable results with the baselines that require model parameter access and back-propagation for optimization. |
| Researcher Affiliation | Collaboration | ¹Department of Electrical and Computer Engineering, Duke University, Durham, USA; ²NVIDIA, Santa Clara, USA. |
| Pseudocode | Yes | Algorithm 1 Training Algorithm of FedBPT. Algorithm 2 CMA-ES update. |
| Open Source Code | Yes | Our code is available in NVIDIA FLARE. |
| Open Datasets | Yes | We conduct experiments on three language understanding datasets: (1) The SST-2 (Socher et al., 2013) is a popular sentiment analysis dataset. (2) The Yelp polarity (yelp) is another sentiment analysis dataset... (3) The AG's News dataset (OpenAI) is a large-scale topic classification dataset... |
| Dataset Splits | No | The paper describes constructing a training set and splitting data for IID/non-IID settings but does not explicitly provide details for a validation dataset split or how it was used. |
| Hardware Specification | No | The paper discusses the target deployment environment, mentioning 'edge devices,' 'mobile phones,' and 'AR headsets,' and refers to 'devices with limited resources.' However, it does not provide specific details (e.g., GPU/CPU models, memory) about the hardware used to conduct the reported experiments. |
| Software Dependencies | No | The paper mentions general software frameworks like 'TensorFlow Lite, PyTorch Mobile, and Apple Core ML' and 'SOTA PLMs' (RoBERTa, Llama 2), but it does not specify any software dependencies with version numbers required to replicate the experiments. |
| Experiment Setup | Yes | FL setup & Hyperparameters: We follow FedPrompt (Zhao et al., 2023) to design our FL setup. The system has ten clients, and all of the clients participate in training in each round. Considering the real world, where many users possess only a limited amount of labeled data, we conduct experiments under few-shot settings. We randomly select 40 samples for each class to construct a training set Dtrain. We conduct experiments in both IID and non-IID settings. More detailed hyperparameters can be found in Appendix B. [...] The initial search step length σ1 is 1. We set the local iteration I to 8 and the local population λk to 5 for all clients. |
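
The pseudocode and experiment-setup rows above describe the core loop: each client tunes a low-dimensional prompt vector with gradient-free CMA-ES against a frozen PLM (Algorithms 1 and 2), using ten clients, 8 local iterations, a local population of 5, and an initial step length of 1. The sketch below illustrates that loop under stated assumptions: it uses the `cma` Python package for the local CMA-ES updates, a toy placeholder `query_model_loss` standing in for inference-only queries to the frozen PLM, an assumed 500-dimensional prompt vector, and a server that simply averages the clients' returned distribution means rather than reproducing the paper's exact aggregation of the CMA-ES state.

```python
import numpy as np
import cma  # pip install cma; provides the CMA-ES optimizer used for black-box prompt tuning

# Hyperparameters reported in the paper's FL setup (values quoted in the table above).
NUM_CLIENTS = 10     # all ten clients participate in every round
LOCAL_ITERS = 8      # local CMA-ES iterations I
LOCAL_POPSIZE = 5    # local population size lambda_k
SIGMA_INIT = 1.0     # initial search step length sigma_1
PROMPT_DIM = 500     # assumed intrinsic dimension of the tunable prompt vector

def query_model_loss(prompt_vector, local_data):
    """Placeholder black-box objective. In a FedBPT-style system this would project
    the prompt vector into the frozen PLM's embedding space and return the loss on
    the client's few-shot data via inference-only queries (no gradients)."""
    # Toy stand-in so the sketch runs end to end; replace with real PLM queries.
    return float(np.sum(np.asarray(prompt_vector) ** 2))

def local_black_box_update(global_mean, sigma, local_data):
    """One client's local step: a few CMA-ES generations started from the current
    global prompt, using only loss queries (no model access, no back-propagation)."""
    es = cma.CMAEvolutionStrategy(global_mean, sigma,
                                  {"popsize": LOCAL_POPSIZE, "verbose": -9})
    for _ in range(LOCAL_ITERS):
        candidates = es.ask()                       # sample lambda_k candidate prompts
        losses = [query_model_loss(c, local_data) for c in candidates]
        es.tell(candidates, losses)                 # CMA-ES distribution update
    return np.asarray(es.result.xfavorite)          # client's updated distribution mean

def federated_round(global_prompt, client_datasets):
    """One communication round: each client uploads only a PROMPT_DIM-sized vector;
    the server averages them (a simplification of the paper's aggregation rule)."""
    client_means = [local_black_box_update(global_prompt, SIGMA_INIT, data)
                    for data in client_datasets]
    return np.mean(client_means, axis=0)

if __name__ == "__main__":
    global_prompt = np.zeros(PROMPT_DIM)
    client_datasets = [None] * NUM_CLIENTS   # placeholders for 40-shot-per-class local sets
    for _ in range(3):                       # a few rounds for illustration
        global_prompt = federated_round(global_prompt, client_datasets)
    print("final prompt norm:", np.linalg.norm(global_prompt))
```

Replacing `query_model_loss` with real inference-API calls and per-client few-shot data is the deployment-specific part; the hyperparameters reported above map directly onto `LOCAL_ITERS`, `LOCAL_POPSIZE`, and `SIGMA_INIT`.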