Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Transformer-based Planning for Symbolic Regression
Authors: Parshin Shojaee, Kazem Meidani, Amir Barati Farimani, Chandan Reddy
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various datasets show that our approach outperforms state-of-the-art methods, enhancing the model's fitting-complexity trade-off, extrapolation abilities, and robustness to noise. |
| Researcher Affiliation | Academia | Parshin Shojaee¹, Kazem Meidani², Amir Barati Farimani²,³, Chandan K. Reddy¹; ¹Department of Computer Science, Virginia Tech; ²Department of Mechanical Engineering, Carnegie Mellon University; ³Machine Learning Department, Carnegie Mellon University |
| Pseudocode | No | The paper describes the steps of TPSR but does not present them in a structured pseudocode or algorithm block format. |
| Open Source Code | Yes | The codes are available at: https://github.com/deep-symbolic-mathematics/TPSR |
| Open Datasets | Yes | We evaluate TPSR and various baseline methods on standard SR benchmark datasets from Penn Machine Learning Benchmark (PMLB) [43] studied in SRBench [42], as well as In-domain Synthetic Data generated based on [38, 18]. The benchmark datasets include 119 equations from Feynman Lectures on Physics database series [44], 14 symbolic regression problems from the ODE-Strogatz database [45], and 57 Black-box regression problems without known underlying equations. |
| Dataset Splits | No | The paper mentions '400 validation functions' for the In-domain Synthetic Data and uses SRBench datasets, but does not provide specific train/validation/test split percentages, sample counts, or detailed splitting methodology (e.g., random seed, stratified splitting) needed for reproducibility. It implies a 'test set' for evaluation and 'training points' for some analysis, but lacks explicit, comprehensive split details for all datasets. |
| Hardware Specification | No | The paper does not provide specific details on the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the implementation or experimentation. |
| Experiment Setup | Yes | For the E2E baseline, we use the settings reported in [18], including beam/sample size of C = 10 candidates, and the refinement of all the candidates K = 10. For our model, we use the width of tree search as kmax = 3, number of rollouts r = 3, and simulation beam size b = 1 as the default setting. |
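
For readers who want to record the quoted experiment settings in a reusable form, the following is a minimal sketch that captures them as Python dataclasses. The class and field names are hypothetical and chosen only for illustration; they are not taken from the TPSR repository and should not be read as its actual configuration keys or command-line flags. Only the numeric values come from the Experiment Setup row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TPSRSearchConfig:
    """Default TPSR search settings quoted in the Experiment Setup row.

    Field names are hypothetical; only the values come from the paper excerpt.
    """
    tree_search_width: int = 3      # kmax: width of tree search
    num_rollouts: int = 3           # r: number of rollouts
    simulation_beam_size: int = 1   # b: simulation beam size


@dataclass(frozen=True)
class E2EBaselineConfig:
    """E2E baseline settings quoted in the same row (hypothetical field names)."""
    num_candidates: int = 10          # C: beam/sample size
    num_refined_candidates: int = 10  # K: candidates that are refined


# Convert to plain dictionaries, e.g. for logging alongside experiment results.
tpsr_defaults = asdict(TPSRSearchConfig())
e2e_defaults = asdict(E2EBaselineConfig())

print(tpsr_defaults)
print(e2e_defaults)
```

Freezing the dataclasses keeps the recorded defaults immutable, so any deviation from the reported settings has to be made explicitly by constructing a new instance.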