Guiding Large Language Models via Directional Stimulus Prompting
Authors: Zekun Li, Baolin Peng, Pengcheng He, Michel Galley, Jianfeng Gao, Xifeng Yan
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments indicate a consistent improvement in the performance of LLMs such as ChatGPT, Codex, and InstructGPT on these supervised tasks with minimal labeled data. |
| Researcher Affiliation | Collaboration | University of California, Santa Barbara; Microsoft. {zekunli, xyan}@cs.ucsb.edu; {bapeng,penhe,mgalley,jfgao}@microsoft.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data are publicly available: https://github.com/Leezekun/Directional-Stimulus-Prompting |
| Open Datasets | Yes | We conduct our experiments on the CNN/Daily Mail dataset, a widely used news summarization benchmark. ... We conduct experiments on the popular task-oriented dialogue dataset MultiWOZ [7], including both the MultiWOZ 2.0 (the original version) and MultiWOZ 2.1 version [15]. |
| Dataset Splits | Yes | This dataset contains 287,113 training examples, 13,368 validation examples, and 11,490 test examples. To keep the API usage cost low, we use a subset of 1,000, 2,000, and 4,000 for training, 500 for validation, and 500 for testing. (See the subsampling sketch after this table.) |
| Hardware Specification | Yes | All the experiments are run on a server equipped with 8 NVIDIA RTX A6000 GPUs. |
| Software Dependencies | No | The paper mentions software like T5, Flan-T5, ChatGPT, Codex, InstructGPT, and the spaCy package, but does not provide specific version numbers for these or other ancillary software dependencies. |
| Experiment Setup | Yes | The hyperparameters used in our experiments are detailed in Table 3. ... Supervised fine-tuning (SFT): batch size 8, epochs 5, learning rate 0.00002. ... RL (NLPO): steps per update 5120, total number of steps 51200, batch size 8, epochs per update 5, learning rate 0.000002 ... (See the configuration sketch after this table.) |
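
The "Dataset Splits" row reports that the full CNN/Daily Mail splits were reduced to 1,000/2,000/4,000 training examples, 500 validation examples, and 500 test examples to limit API cost. The row does not state how the subsets were drawn, so the following is only a minimal sketch using the Hugging Face `datasets` library; the shuffling and seed are assumptions, not the authors' documented procedure.

```python
from datasets import load_dataset

# Load the standard CNN/Daily Mail splits (287,113 / 13,368 / 11,490 examples).
raw = load_dataset("cnn_dailymail", "3.0.0")

def take_subset(split, n, seed=42):
    """Return n examples from a shuffled split (seed is an assumption)."""
    return split.shuffle(seed=seed).select(range(n))

# Subset sizes quoted in the table above.
train_1k = take_subset(raw["train"], 1_000)
train_2k = take_subset(raw["train"], 2_000)
train_4k = take_subset(raw["train"], 4_000)
val_500 = take_subset(raw["validation"], 500)
test_500 = take_subset(raw["test"], 500)

print(len(train_1k), len(val_500), len(test_500))  # 1000 500 500
```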
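The "Experiment Setup" row quotes the SFT and RL (NLPO) hyperparameters from the paper's Table 3. The sketch below simply groups those quoted values into one configuration object for readability; the dictionary keys are illustrative and do not correspond to the field names used in the authors' repository.

```python
# Hyperparameters as quoted from the paper's Table 3; key names are hypothetical.
config = {
    "sft": {                    # supervised fine-tuning stage
        "batch_size": 8,
        "epochs": 5,
        "learning_rate": 2e-5,
    },
    "rl_nlpo": {                # RL stage using the NLPO algorithm
        "steps_per_update": 5_120,
        "total_steps": 51_200,
        "batch_size": 8,
        "epochs_per_update": 5,
        "learning_rate": 2e-6,
    },
}
```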