Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models

Authors: Javier González, Aditya V. Nori

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical tests on various reasoning problems as well as several insights about the reasoning abilities of language models in the GPT family.
Researcher Affiliation | Industry | Javier González (Gonzalez.Javier@microsoft.com), Microsoft Research, Cambridge; Aditya V. Nori (Aditya.Nori@microsoft.com), Microsoft Research, Cambridge
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The code to reproduce the analyses and figures can be provided upon request, and will be made open source if this work is accepted for publication.
Open Datasets | No | The paper uses generated datasets based on mathematical rules (e.g., "natural numbers N from 1 to 400") and does not provide concrete access information (link, DOI, citation) to a publicly available or open dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or test sets.
Hardware Specification | No | Experiments do not require extensive compute resources (like large memory of GPUs) since they only require calls to GPT API models and some simple local computation. The experiments of this work can be reproduced in any personal laptop.
Software Dependencies | No | The paper mentions using GPT-2, GPT-3.5-turbo, and GPT-4 models but does not list specific software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers.
Experiment Setup | Yes | We test the reasoning abilities of an LLM using natural numbers N from 1 to 400. This is shown in Figure 2(A). Direct Prompt: "Does 6 divide { X }? Use the factor method to answer this question. Be as concise as possible." Counterfactual Prompt: "Imagine that { X } { has / has not } 3 as prime factor while retaining all its other prime factors. With this assumption does {self.divisor} divide { X }? Use the factor method to answer this question. Be as concise as possible."
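
The experiment setup quoted above can be illustrated with a short sketch. This is not the authors' code, which is not public. It only shows how the two prompt templates might be filled in for N from 1 to 400, and how reference answers could be derived with the factor method. The function names, the choice to substitute 6 for the {self.divisor} placeholder, and the reading of the counterfactual (add or remove the prime factor 3, keep every other prime factor) are assumptions made for illustration. No model is called.

```python
# Hypothetical sketch of the divisibility-by-6 setup; all names are illustrative.
DIVISOR = 6  # assumption: {self.divisor} in the quoted template stands for 6


def direct_prompt(x, divisor=DIVISOR):
    """Fill in the direct prompt quoted in the Experiment Setup row."""
    return (
        f"Does {divisor} divide {x}? Use the factor method to answer "
        "this question. Be as concise as possible."
    )


def counterfactual_prompt(x, has_three, divisor=DIVISOR):
    """Fill in the counterfactual prompt for the 'has' or 'has not' variant."""
    clause = "has" if has_three else "has not"
    return (
        f"Imagine that {x} {clause} 3 as prime factor while retaining all "
        f"its other prime factors. With this assumption does {divisor} "
        f"divide {x}? Use the factor method to answer this question. "
        "Be as concise as possible."
    )


def prime_factors(n):
    """Return the set of distinct prime factors of n by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def factual_answer(x, divisor=DIVISOR):
    """Factor method: divisor | x iff every prime of the divisor is a prime of x.

    Assumption: the divisor is square-free (true for 6), so comparing sets
    of distinct primes is sufficient.
    """
    return prime_factors(divisor) <= prime_factors(x)


def counterfactual_answer(x, has_three, divisor=DIVISOR):
    """Reference answer after forcing 3 into or out of x's prime factors."""
    factors = prime_factors(x)
    if has_three:
        factors.add(3)
    else:
        factors.discard(3)
    return prime_factors(divisor) <= factors


# One record per natural number in the range used by the paper.
records = [
    {
        "x": x,
        "direct": direct_prompt(x),
        "cf_has_3": counterfactual_prompt(x, True),
        "cf_has_not_3": counterfactual_prompt(x, False),
        "truth": factual_answer(x),
        "truth_has_3": counterfactual_answer(x, True),
        "truth_has_not_3": counterfactual_answer(x, False),
    }
    for x in range(1, 401)
]
```

Under this reading, the "has 3" counterfactual is true exactly when the number is even, and the "has not 3" counterfactual is always false, which gives a simple check on any model response collected with these prompts.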