Extensible Prompts for Language Models on Zero-shot Language Style Customization
Authors: Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment X-Prompt for zero-shot language style customization as a case study. The promising results of X-Prompt demonstrate its potential to facilitate advanced interaction beyond the natural language interface, bridging the communication gap between humans and LLMs. |
| Researcher Affiliation | Industry | Tao Ge, Jing Hu, Li Dong, Shaoguang Mao, Yan Xia, Xun Wang, Si-Qing Chen, Furu Wei (Microsoft). {tage,v-hjing,lidong1,shamao,yanxia,xunwang}@microsoft.com, {sqchen,fuwei}@microsoft.com |
| Pseudocode | No | The paper describes methods and processes in prose and illustrates them with figures, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a statement about releasing open-source code for the described methodology, nor does it include a direct link to a code repository. |
| Open Datasets | Yes | We use the publicly available Top 20 most followed users in Twitter social platform dataset (https://shorturl.at/htDHT), which contains over 50K tweets from 20 users (20-user dataset), and the Sentiment dataset (https://shorturl.at/pvBLX), from which we extract top 800 users (in total 68K) tweets (800-user dataset), to verify the capability of X-Prompt to instruct an LM to generate user-specific language. |
| Dataset Splits | Yes | We split the datasets in 90/5/5 by user for training, validation and test. (A hedged split sketch follows the table.) |
| Hardware Specification | Yes | We run up to 6000 updates with a global batch size of 8192 tokens on 8 Nvidia V100 GPUs using DeepSpeed ZeRO-2 (Rajbhandari et al., 2020). |
| Software Dependencies | No | The paper mentions using the Adam optimizer, DeepSpeed ZeRO-2, OPT-6.7b, and BERT-base, but it does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We use Adam optimizer (Kingma and Ba, 2014) with the max learning rate of 2e-4 with a warmup for the first 10% training steps followed by a linear decay. We run up to 6000 updates with a global batch size of 8192 tokens on 8 Nvidia V100 GPUs using DeepSpeed ZeRO-2 (Rajbhandari et al., 2020). (A hedged training-configuration sketch follows the table.) |
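
The 90/5/5 by-user split quoted in the Dataset Splits row can be read in more than one way, so a minimal sketch may help. The paper releases no code, so the record format `(user, text)`, the random seed, and the interpretation that each user's tweets are partitioned 90/5/5 (so every user appears in all three splits) are assumptions for illustration only.

```python
import random
from collections import defaultdict

def split_by_user(records, seed=42, ratios=(0.90, 0.05, 0.05)):
    """Partition (user, text) records into train/valid/test, 90/5/5 per user.

    Assumption: "split ... by user" is read as shuffling each user's tweets
    and dividing them 90/5/5, so the per-user style prompt can be evaluated
    on that user's held-out tweets.
    """
    per_user = defaultdict(list)
    for user, text in records:
        per_user[user].append(text)

    rng = random.Random(seed)
    train, valid, test = [], [], []
    for user, texts in per_user.items():
        rng.shuffle(texts)
        n_train = int(len(texts) * ratios[0])
        n_valid = int(len(texts) * ratios[1])
        train += [(user, t) for t in texts[:n_train]]
        valid += [(user, t) for t in texts[n_train:n_train + n_valid]]
        test += [(user, t) for t in texts[n_train + n_valid:]]
    return train, valid, test
```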
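
The Experiment Setup row names the Adam optimizer, a 2e-4 peak learning rate, linear warmup over the first 10% of steps followed by linear decay, up to 6000 updates, a global batch of 8192 tokens, and DeepSpeed ZeRO-2 on 8 V100s. The sketch below only mirrors those stated values; the placeholder model, the micro-batch size, fp16, and the mapping of the token-level global batch onto sequences are assumptions, not the authors' released configuration.

```python
import torch

MAX_STEPS = 6000                      # "up to 6000 updates"
WARMUP_STEPS = int(0.10 * MAX_STEPS)  # warmup over the first 10% of steps
MAX_LR = 2e-4

# Placeholder module; the paper trains its prompts against OPT-6.7b.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=MAX_LR)

def lr_lambda(step):
    """Linear warmup to MAX_LR, then linear decay to 0 at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return step / max(1, WARMUP_STEPS)
    return max(0.0, (MAX_STEPS - step) / max(1, MAX_STEPS - WARMUP_STEPS))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# The paper distributes training with DeepSpeed ZeRO-2; a config along these
# lines would be passed to deepspeed.initialize. DeepSpeed expresses batch
# size in sequences, so how the 8192-token global batch maps onto per-GPU
# micro-batches is left as an assumption here.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,   # illustrative value
    "gradient_accumulation_steps": 1,
    "zero_optimization": {"stage": 2},
    "fp16": {"enabled": True},
}
```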