Extensible Prompts for Language Models on Zero-shot Language Style Customization

Authors: Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment X-Prompt for zero-shot language style customization as a case study. The promising results of X-Prompt demonstrate its potential to facilitate advanced interaction beyond the natural language interface, bridging the communication gap between humans and LLMs.
Researcher Affiliation | Industry | Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei (Microsoft); {tage,v-hjing,lidong1,shamao,yanxia,xunwang}@microsoft.com, {sqchen,fuwei}@microsoft.com
Pseudocode | No | The paper describes methods and processes in prose and illustrates them with figures, but it does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a statement about releasing open-source code for the described methodology, nor does it include a direct link to a code repository.
Open Datasets | Yes | We use the publicly available Top 20 most followed users in Twitter social platform dataset [4], which contains over 50K tweets from 20 users (20-user dataset), and the Sentiment dataset [5], from which we extract the tweets of the top 800 users (68K in total; 800-user dataset), to verify the capability of X-Prompt to instruct an LM to generate user-specific language. [4] https://shorturl.at/htDHT [5] https://shorturl.at/pvBLX
Dataset Splits | Yes | We split the datasets in 90/5/5 by user for training, validation and test. (A sketch of such a per-user split appears after this table.)
Hardware Specification | Yes | We run up to 6000 updates with a global batch size of 8192 tokens on 8 Nvidia V100 GPUs using DeepSpeed ZeRO-2 (Rajbhandari et al., 2020).
Software Dependencies | No | The paper mentions using the Adam optimizer, DeepSpeed ZeRO-2, OPT-6.7b, and BERT-base, but it does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba, 2014) with a max learning rate of 2e-4 and a warmup over the first 10% of training steps followed by a linear decay. We run up to 6000 updates with a global batch size of 8192 tokens on 8 Nvidia V100 GPUs using DeepSpeed ZeRO-2 (Rajbhandari et al., 2020).
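For concreteness, a minimal sketch of the reported 90/5/5 split is given below. It assumes the tweets are available as (user, text) pairs, reads "by user" as splitting each user's tweets 90/5/5 (rather than partitioning the users themselves), and assumes the 800-user subset is formed by keeping the most prolific users; the helper names, the seed, and these interpretations are assumptions, not details stated in the paper.

```python
import random
from collections import Counter, defaultdict

def top_k_users(tweets, k=800):
    """Keep only tweets from the k users with the most tweets (cf. the 800-user dataset).

    Hypothetical helper: the selection criterion for the "top" users is an assumption.
    """
    counts = Counter(user for user, _ in tweets)
    keep = {user for user, _ in counts.most_common(k)}
    return [(user, text) for user, text in tweets if user in keep]

def split_by_user(tweets, ratios=(0.90, 0.05, 0.05), seed=0):
    """Split (user, text) pairs into train/valid/test, 90/5/5 within each user."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, text in tweets:
        by_user[user].append(text)

    train, valid, test = [], [], []
    for user, texts in by_user.items():
        rng.shuffle(texts)
        n_train = int(ratios[0] * len(texts))
        n_valid = int(ratios[1] * len(texts))
        train += [(user, t) for t in texts[:n_train]]
        valid += [(user, t) for t in texts[n_train:n_train + n_valid]]
        test += [(user, t) for t in texts[n_train + n_valid:]]
    return train, valid, test
```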
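The optimization setup quoted in the Experiment Setup row (Adam, peak learning rate 2e-4, warmup over the first 10% of steps followed by linear decay, up to 6000 updates, DeepSpeed ZeRO-2) can be expressed roughly as follows. This is a hedged sketch, not the authors' code: the `get_linear_schedule_with_warmup` helper from `transformers`, the function name, and every value not quoted in the table (e.g. fp16, gradient accumulation) are assumptions about one plausible implementation.

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Values quoted in the paper; everything else below is an assumption.
MAX_UPDATES = 6000
PEAK_LR = 2e-4
WARMUP_STEPS = int(0.10 * MAX_UPDATES)  # warmup over the first 10% of steps

def build_optimizer_and_scheduler(model: torch.nn.Module):
    """Adam with linear warmup followed by linear decay (hypothetical helper)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=PEAK_LR)
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=WARMUP_STEPS,
        num_training_steps=MAX_UPDATES,
    )
    return optimizer, scheduler

# A plausible DeepSpeed ZeRO-2 configuration. The keys are standard DeepSpeed
# options, but the paper specifies the global batch in tokens (8192) rather
# than samples, so the batch-related and precision settings are illustrative.
ds_config = {
    "zero_optimization": {"stage": 2},
    "gradient_accumulation_steps": 1,
    "fp16": {"enabled": True},
}
```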