Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Controllable Neural Symbolic Regression
Authors: Tommaso Bendinelli, Luca Biggio, Pierre-Alexandre Kamienny
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3. Experiments. In this section, we first introduce the datasets and metrics used to evaluate the model, and then we present our experiments aimed to assess different properties of NSRwH, including its controllability and its performance when DPI is available and when it is not. |
| Researcher Affiliation | Collaboration | 1 CSEM SA, Alpnach; 2 ETH Zürich; 3 Meta AI; 4 Sorbonne Université, CNRS, ISIR. |
| Pseudocode | No | The paper describes the methods and data generation steps in prose but does not include any explicit pseudocode blocks or clearly labeled algorithm figures. |
| Open Source Code | Yes | Code is available at https://github.com/SymposiumOrganization/ControllableNeuralSymbolicRegression. |
| Open Datasets | Yes | AIF: it comprises all the equations with up to 5 independent variables extracted from the publicly available AIFeynman database (Udrescu and Tegmark, 2020). ... For the ODE-Strogatz dataset we followed the approach from (La Cava et al., 2021) and used 75% of the points from the function call fetch_data from the PMLB repository for training (Olson et al., 2017) and the remaining for testing. |
| Dataset Splits | No | The paper mentions using training points to fit constants and select the best expressions and, for Section 3.4, splitting evaluation points (60% for fitting, 40% for selection). However, it does not explicitly describe a separate validation split (e.g., percentages or counts) used during model training or hyperparameter tuning. |
| Hardware Specification | Yes | We trained the model with 200 million equations using three NVIDIA GeForce RTX 3090 for a total of five days with a batch size of 400. |
| Software Dependencies | No | The paper mentions software such as SymPy and the Adam optimizer but does not specify version numbers or otherwise pin its software dependencies. |
| Experiment Setup | Yes | We trained the model with 200 million equations using three NVIDIA GeForce RTX 3090 for a total of five days with a batch size of 400. |
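The split protocol quoted above (60% of evaluation points to fit constants, the remaining 40% to select the best expression) can be sketched as below. This is an illustrative reconstruction, not the authors' code; the function name `split_points` and the seeding scheme are assumptions.

```python
import numpy as np

def split_points(X, y, fit_frac=0.6, seed=0):
    """Randomly partition evaluation points: a fit_frac share for
    fitting constants, the rest for selecting the best expression.
    Illustrative sketch only, not taken from the paper's codebase."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_fit = int(round(fit_frac * len(X)))
    fit_idx, sel_idx = idx[:n_fit], idx[n_fit:]
    return (X[fit_idx], y[fit_idx]), (X[sel_idx], y[sel_idx])

# Example: 100 evaluation points -> 60 for fitting, 40 for selection
X = np.linspace(0, 1, 100).reshape(-1, 1)
y = np.sin(X).ravel()
(X_fit, y_fit), (X_sel, y_sel) = split_points(X, y)
```

The same helper with `fit_frac=0.75` would mirror the 75%/25% train/test split the paper reports for the ODE-Strogatz points fetched from PMLB.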