Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Authors: Gilad Yehudai, Noah Amsel, Joan Bruna
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that under standard hardness assumptions, none of these three architectures is capable of solving CRQs unless some hyperparameter (depth, embedding dimension, and number of chain of thought tokens, respectively) grows with the size of the input. We then provide constructions for solving CRQs with each architecture. Our main contributions are as follows: 1. In Section 3.1, we present Compositional Reasoning Questions... 2. In Section 4, we prove that transformers with constant depth cannot solve arbitrary CRQs (Theorem 4.3), but transformers with depth L can solve all CRQs of depth up to L (Theorem 4.1). 3. In Section 5, we prove that RNNs with constant hidden dimension cannot solve arbitrary CRQs (Theorem 5.5), but RNNs with O(log n) hidden dimension and constant depth can solve all CRQs of size n (Theorem 5.4). 4. In Section 6, we prove that transformers augmented with O(log n) CoT tokens cannot solve CRQs of size n, but transformers augmented with O(n) CoT tokens can (Theorem 6.1). |
| Researcher Affiliation | Academia | Gilad Yehudai (Courant Institute of Mathematical Sciences, New York University); Noah Amsel (Courant Institute of Mathematical Sciences, New York University); Joan Bruna (Courant Institute of Mathematical Sciences & Center for Data Science, New York University; Center for Computational Mathematics, Flatiron Institute) |
| Pseudocode | Yes | Algorithm 1: Memory-Rank Sort |
| Open Source Code | No | The NeurIPS Paper Checklist for this submission states: "This paper does not include experiments." and "No data or models are released." There is no explicit statement in the paper about releasing code or a link to a repository. |
| Open Datasets | No | The NeurIPS Paper Checklist for this submission states: "This paper does not include experiments." There is no mention of datasets being made publicly available for download or being used in experiments within the paper text. |
| Dataset Splits | No | The paper does not conduct experiments and therefore does not discuss dataset splits. |
| Hardware Specification | No | The paper does not conduct experiments and therefore does not specify hardware used. |
| Software Dependencies | No | The paper does not conduct experiments and therefore does not specify software dependencies with version numbers. |
| Experiment Setup | No | The paper does not conduct experiments and therefore does not provide details about experimental setup or hyperparameters. |