Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models
Authors: Javier González, Aditya V. Nori
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | ii) Empirical tests on various reasoning problems as well as several insights about the reasoning abilities of language models in the GPT family. |
| Researcher Affiliation | Industry | Javier González (Gonzalez.Javier@microsoft.com), Microsoft Research, Cambridge; Aditya V. Nori (Aditya.Nori@microsoft.com), Microsoft Research, Cambridge |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The code to reproduce the analyses and figures can be provided upon request, and will be made open source if this work is accepted for publication. |
| Open Datasets | No | The paper uses generated datasets based on mathematical rules (e.g., "natural numbers N from 1 to 400") and does not provide concrete access information (link, DOI, citation) to a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or test sets. |
| Hardware Specification | No | Experiments do not require extensive compute resources (like large memory of GPUs) since they only require calls to GPT API models and some simple local computation. The experiments of this work can be reproduced in any personal laptop. |
| Software Dependencies | No | The paper mentions using GPT-2, GPT-3.5-turbo, and GPT-4 models but does not list specific software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers. |
| Experiment Setup | Yes | We test the reasoning abilities of an LLM using natural numbers N from 1 to 400. This is shown in Figure 2(A). Direct Prompt: "Does 6 divide { X }? Use the factor method to answer this question. Be as concise as possible." Counterfactual Prompt: "Imagine that { X } { has / has not } 3 as prime factor while retaining all its other prime factors. With this assumption does {self.divisor} divide { X }? Use the factor method to answer this question. Be as concise as possible." |
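
The experiment setup quoted in the last row can be sketched as a small prompt generator. This is a minimal illustration, not the authors' code (which is not public): the function names, the `DIVISOR` constant, and the ground-truth helpers are hypothetical. The counterfactual ground truth assumes the literal reading of the prompt, namely that only the presence of the prime factor 3 is altered, so 6 divides the imagined number exactly when 3 is imagined present and 2 already divides X.

```python
# Hypothetical sketch of the prompt setup; all names here are assumptions.
DIVISOR = 6  # divisor used in the quoted direct prompt

SUFFIX = "Use the factor method to answer this question. Be as concise as possible."


def direct_prompt(x: int) -> str:
    """Build the direct prompt for a natural number x."""
    return f"Does {DIVISOR} divide {x}? {SUFFIX}"


def counterfactual_prompt(x: int, has_three: bool) -> str:
    """Build the counterfactual prompt, imagining 3 present or absent."""
    verb = "has" if has_three else "has not"
    return (
        f"Imagine that {x} {verb} 3 as prime factor while retaining all its "
        f"other prime factors. With this assumption does {DIVISOR} divide {x}? "
        f"{SUFFIX}"
    )


def direct_truth(x: int) -> bool:
    """Ground truth for the direct question."""
    return x % DIVISOR == 0


def counterfactual_truth(x: int, has_three: bool) -> bool:
    """Assumed ground truth: 6 divides the imagined number only if 3 is
    imagined present and 2 is already a factor of x."""
    return has_three and x % 2 == 0


# Natural numbers from 1 to 400, as stated in the setup.
numbers = range(1, 401)
prompts = [
    (direct_prompt(x), counterfactual_prompt(x, True), counterfactual_prompt(x, False))
    for x in numbers
]
```

Each prompt would then be sent to the model under test and the answer compared with the matching ground-truth helper; the API call itself is omitted because the summary does not specify it.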