Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
Authors: Chengzhengxu Li, Xiaoming Liu, Zhaohan Zhang, Yichen Wang, Chen Liu, Yu Lan, Chao Shen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our idea improves comparison prompt optimization methods by 1.42% for soft prompt generalization and 2.16% for hard prompt generalization in accuracy on the multi-source domain generalization setting, while maintaining satisfying in-domain performance. |
| Researcher Affiliation | Academia | 1 Faculty of Electronic and Information Engineering, Xi'an Jiaotong University; 2 Queen Mary University of London, London, UK; 3 University of Chicago |
| Pseudocode | Yes | Algorithm 1 shows the detailed process of concentrative soft prompt optimization in Section 4.1. It also reveals that our method can be widely applied to different soft prompt optimization methods to improve their domain generalization capabilities. Algorithm 2: Concentrative Hard Prompt Optimization |
| Open Source Code | Yes | Our codes are available at https://github.com/czx-li/Concentrate-Attention |
| Open Datasets | Yes | We select the SST-2 Socher et al. [2013], MR Pang and Lee [2005], and CR Hu and Liu [2004] datasets for sentiment classification, and the WNLI, QNLI, and RTE datasets from GLUE Wang et al. [2018] for NLI tasks. |
| Dataset Splits | Yes | For all tasks, we randomly select 32 samples from each source domain as the training set to simulate MFDG setting. We use the same approach to build the validation set and ensure that the number of labels in the training and validation sets is balanced. |
| Hardware Specification | Yes | All experimental results are the average results of 10 different random seeds on a single NVIDIA A100 GPU. |
| Software Dependencies | No | I could not find specific version numbers for key software components or libraries, only general mentions of models (e.g., RoBERTa-Large) and optimizers (AdamW). |
| Experiment Setup | Yes | For Soft Prompt Tuning, we replace the Manual Prompt tokens with five soft tokens in the same positions, and optimize them using the AdamW optimizer [Loshchilov and Hutter, 2017] with learning rate 2 × 10⁻⁵ and batch size 32 for 300 epochs. For Prefix Tuning and P-Tuning v2, we apply the AdamW optimizer with a learning rate of 2 × 10⁻⁴ and train for 100 epochs; the mini-batch size is 8 and the prompt length is set to 10. The setting of the hard prompt optimization baselines (In-Context Demo, DP2O, GrIPS and RLPrompt) follows Li et al. [2024]. |
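
The dataset-split procedure described above (32 training examples per source domain, with a validation set built the same way and balanced labels) can be illustrated with a short sketch. This is a minimal, hypothetical reconstruction that assumes a simple list-of-dicts schema with `text`, `label`, and `domain` fields; it is not the authors' released code (see the linked repository for that).

```python
# Sketch of the few-shot MFDG split construction: sample k examples per
# source domain with labels balanced within each domain. The data schema
# and function name are assumptions made for illustration only.
import random
from collections import defaultdict

def sample_few_shot(examples, k_per_domain=32, seed=0):
    """examples: list of dicts with 'text', 'label', 'domain' keys (assumed schema)."""
    rng = random.Random(seed)
    by_domain_label = defaultdict(list)
    for ex in examples:
        by_domain_label[(ex["domain"], ex["label"])].append(ex)

    domains = {d for d, _ in by_domain_label}
    labels = {l for _, l in by_domain_label}
    per_label = k_per_domain // len(labels)  # balance labels within each domain

    selected = []
    for d in domains:
        for l in labels:
            pool = by_domain_label[(d, l)]
            selected.extend(rng.sample(pool, min(per_label, len(pool))))
    return selected
```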
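Likewise, the Soft Prompt Tuning configuration in the experiment-setup row (five soft tokens, AdamW with learning rate 2 × 10⁻⁵, batch size 32, 300 epochs, RoBERTa-Large backbone) can be sketched as below. Freezing the backbone and the soft-prompt initialization scale are assumptions for illustration, not details confirmed by the paper; the training loop is only outlined in comments.

```python
# Minimal sketch of the reported soft prompt tuning setup. The exact prompt
# placement and loss computation are in the authors' repository; this block
# only mirrors the hyperparameters quoted above.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "roberta-large"                       # backbone mentioned in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
for p in model.parameters():                       # assumed: backbone stays frozen
    p.requires_grad = False

embed_dim = model.config.hidden_size
soft_prompt = torch.nn.Parameter(torch.randn(5, embed_dim) * 0.02)  # 5 soft tokens

optimizer = torch.optim.AdamW([soft_prompt], lr=2e-5)  # AdamW, lr 2e-5
num_epochs, batch_size = 300, 32                       # as reported

# Training loop (omitted): for each batch, prepend `soft_prompt` to the input
# embeddings, compute the masked-LM/verbalizer classification loss, and step
# the optimizer on the soft prompt parameters only.
```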