Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
From Euler to AI: Unifying Formulas for Mathematical Constants
Authors: Tomer Raz, Michael Shalyt, Elyasheev Leibtag, Rotem Kalisch, Shachar Weinbaum, Yaron Hadad, Ido Kaminer
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Applying this approach to 455,050 arXiv papers, we validate 385 distinct formulas for π and prove relations between 360 (94%) of them, of which 166 (43%) can be derived from a single mathematical object linking canonical formulas by Euler, Gauss, Brouncker, and newer ones from algorithmic discoveries by the Ramanujan Machine. Our system combines large language models (LLMs) for systematic formula harvesting, an LLM-code feedback loop for validation, and a novel symbolic algorithm for clustering and eventual unification. We demonstrate this methodology on the hallmark case of π, an ideal testing ground for symbolic unification. Benchmarking |
| Researcher Affiliation | Academia | Tomer Raz, Michael Shalyt, Elyasheev Leibtag, Rotem Kalisch, Shachar Weinbaum, Yaron Hadad, Ido Kaminer. Technion - Israel Institute of Technology, Haifa 3200003, Israel. Corresponding author: EMAIL |
| Pseudocode | Yes | Figure 4: The matching algorithm: connecting polynomial linear recurrences. This algorithm is demonstrated here for polynomial continued fractions (PCFs) but can be generalized to any linear polynomial recurrence. Appendix C Algorithms: This section contains an in-depth description of the algorithms discussed in Section 3. The algorithms are ordered top-down, from the highest level algorithm to the lowest. |
| Open Source Code | Yes | Project repository: https://github.com/RamanujanMachine/euler2ai |
| Open Datasets | Yes | Applying this approach to 455,050 arXiv papers... [5] arXiv.org submitters. Kaggle arXiv dataset, 2024. 455,050 articles from the following categories which were indexed in the arXiv metadata dataset [5] as of 24 November, 2024, were scraped. |
| Dataset Splits | No | The paper processes 455,050 arXiv articles to extract and validate 385 distinct formulas for pi, which are then used as input for the unification algorithm. This describes a data processing pipeline and subsequent analysis of the processed data, rather than explicit train/test/validation splits for evaluating a model's performance on a dataset. |
| Hardware Specification | Yes | All algorithms used in the pipeline were run on a 13th Gen i5-13500H Intel Core and are available at https://github.com/RamanujanMachine/euler2ai. Runs required for the sensitivity study were conducted on the Technion High Performance Computing Zeus Cluster. |
| Software Dependencies | No | The paper mentions several software components, such as SymPy [38], a Mathematica package by RISC [33], and a Maple package [57] for minimality, as well as LLMs such as OpenAI's GPT-4o, Claude 3.7 Sonnet, and Gemini 2.5 Pro Preview. However, it does not provide specific version numbers for SymPy, Mathematica, or Maple, which are key for reproducible software dependencies. |
| Experiment Setup | Yes | In our experiments, we use N = 200 partial sums when converting each series into a corresponding recurrence. The hyperparameter sensitivity study (Appendix D) supports this choice. UMAPS with N 2d + 1 suffices to recover the coboundary matrix. Appendix D provides a detailed sensitivity study for different δ-clustering granularities and similarity thresholds, values of UMAPS's fit depth (N), and the sensitivity of RISC's Guess algorithm to its fit depth (N). |
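
The Experiment Setup entry above mentions using N = 200 partial sums when converting a series into a recurrence. As a rough illustration of what that conversion can mean, the sketch below takes the Leibniz series for π/4 (chosen here only as a familiar example, not taken from the paper's dataset), computes 200 exact partial sums, and checks that they satisfy a linear recurrence with polynomial coefficients. The recurrence is derived by hand from the term ratio; this is not the paper's pipeline, which fits recurrences with other tools, and all function names are hypothetical.

```python
import math
from fractions import Fraction

N = 200  # number of partial sums, matching the value quoted in the table


def leibniz_term(k):
    """k-th term of the Leibniz series pi/4 = sum_k (-1)^k / (2k + 1)."""
    return Fraction((-1) ** k, 2 * k + 1)


def partial_sums(n_sums):
    """Return [S_0, S_1, ..., S_n_sums], where S_n is the sum of the first n terms."""
    sums = [Fraction(0)]
    for k in range(n_sums):
        sums.append(sums[-1] + leibniz_term(k))
    return sums


def recurrence_residual(sums, n):
    """Left-hand side of (2n+3) S_{n+2} - 2 S_{n+1} - (2n+1) S_n = 0.

    Hand-derived from the term ratio t_{n+1} / t_n = -(2n+1) / (2n+3);
    an illustrative assumption, not the recurrence-fitting step of the paper.
    """
    return (2 * n + 3) * sums[n + 2] - 2 * sums[n + 1] - (2 * n + 1) * sums[n]


S = partial_sums(N)
# Exact rational arithmetic, so every residual should be exactly zero.
residuals = [recurrence_residual(S, n) for n in range(N - 1)]
approx_pi = float(4 * S[N])
error = abs(approx_pi - math.pi)
print(f"all residuals zero: {all(r == 0 for r in residuals)}")
print(f"4 * S_{N} = {approx_pi:.6f}, absolute error = {error:.6f}")
```

Exact `Fraction` arithmetic is used so that the recurrence check is an equality test rather than a floating-point tolerance; the slow convergence of this particular series is visible in the printed error.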