Learning to Reason and Memorize with Self-Notes
Authors: Jack Lanchantin, Shubham Toshniwal, Jason Weston, Arthur Szlam, Sainbayar Sukhbaatar
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across a wide variety of tasks demonstrate that our method can outperform chain-of-thought and scratchpad methods by taking Self-Notes that interleave the input text. |
| Researcher Affiliation | Industry | Jack Lanchantin (Meta AI); Shubham Toshniwal (NVIDIA); Jason Weston (Meta AI); Arthur Szlam (Meta AI); Sainbayar Sukhbaatar (Meta AI) |
| Pseudocode | No | The paper describes its methods verbally and with examples but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Reproducibility statement: We will make code and data publicly available. |
| Open Datasets | Yes | We test our method on seven text datasets designed to evaluate multi-step reasoning and state-tracking: a proposed synthetic Toy-Story task, two synthetic program evaluation tasks [11, 16], two real-world chess game tasks [17], and two math word problem tasks previously used to test chain-of-thought prompting, MultiArith and GSM8K [18, 19]. |
| Dataset Splits | Yes | Table 8: Dataset Statistics, reporting # train / # valid / # test counts, with both in-domain and out-of-domain test sets. |
| Hardware Specification | Yes | We fine-tune all of the GPT-2 models on 8 NVIDIA V100 GPUs using an on-site cluster. |
| Software Dependencies | Yes | The GSM8K experiments were done using the text-davinci-003 model with the OpenAI API (see the prompting sketch below). |
| Experiment Setup | Yes | For each non-prompting task, we train for a fixed 30 epochs with a learning rate of 2e-5 and batch size of 32 (see the training sketch below). |
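
The hardware and setup rows pin down the optimizer schedule but not the surrounding training code. Below is a minimal sketch of the GPT-2 fine-tuning loop, assuming the HuggingFace Transformers `Trainer`, the base `gpt2` checkpoint, a toy stand-in dataset, and a per-device batch size of 4 (so that 8 V100s give the effective batch size of 32); only the 30 epochs, the 2e-5 learning rate, and the total batch size of 32 come from the paper.

```python
# Minimal sketch of the paper's GPT-2 fine-tuning setup, assuming the
# HuggingFace Transformers Trainer. Only num_train_epochs, learning_rate,
# and the effective batch size of 32 come from the paper; the checkpoint,
# sequence length, dataset, note format, and output path are assumptions.
import torch
from transformers import (
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")


class SelfNotesDataset(torch.utils.data.Dataset):
    """Hypothetical stand-in: each example is a story with interleaved
    Self-Note statements flattened into one training sequence."""

    def __init__(self, texts):
        self.encodings = [
            tokenizer(t, truncation=True, max_length=512) for t in texts
        ]

    def __len__(self):
        return len(self.encodings)

    def __getitem__(self, idx):
        ids = torch.tensor(self.encodings[idx]["input_ids"])
        # Standard causal-LM fine-tuning: labels mirror the input ids.
        return {"input_ids": ids, "labels": ids.clone()}


# Toy example in the spirit of the Toy-Story task; the <note> marker is
# a hypothetical way of delimiting a supervised Self-Note.
train_dataset = SelfNotesDataset(
    ["Frank has the key. Frank is in the park. <note> The key is in the park."]
)

args = TrainingArguments(
    output_dir="self-notes-gpt2",   # hypothetical output path
    num_train_epochs=30,            # fixed 30 epochs (from the paper)
    learning_rate=2e-5,             # from the paper
    per_device_train_batch_size=4,  # 4 x 8 GPUs = batch size 32 (assumed split)
)

Trainer(model=model, args=args, train_dataset=train_dataset).train()
```

Note that the paper supervises Self-Notes as intermediate statements interleaved with the input; whether the loss is computed over the full sequence or masked to the note and answer tokens is a detail this sketch does not settle.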
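
For the GSM8K prompting experiments, the software row names only the model and the API. A minimal sketch, assuming the pre-1.0 `openai` Python SDK and its legacy Completions endpoint, is shown below; the prompt text and decoding settings are assumptions, and the few-shot Self-Notes demonstrations the paper would prepend are omitted. Only the text-davinci-003 model name comes from the paper.

```python
# Minimal sketch of querying text-davinci-003, assuming the pre-1.0
# openai Python SDK and its legacy Completions endpoint. Only the model
# name comes from the paper; prompt and decoding settings are assumptions.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# A GSM8K-style question; the in-context demonstrations with interleaved
# Self-Notes that the paper would prepend are omitted here.
prompt = (
    "Q: A robe takes 2 bolts of blue fiber and half that much white "
    "fiber. How many bolts in total does it take?\nA:"
)

response = openai.Completion.create(
    model="text-davinci-003",  # model named in the paper
    prompt=prompt,
    max_tokens=256,
    temperature=0.0,  # greedy decoding is an assumption
)
print(response["choices"][0]["text"].strip())
```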