Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Controllable Neural Symbolic Regression
Authors: Tommaso Bendinelli, Luca Biggio, Pierre-Alexandre Kamienny
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3. Experiments. In this section, we first introduce the datasets and metrics used to evaluate the model, and then we present our experiments aimed to assess different properties of NSRwH, including its controllability and its performance when DPI is available and when it is not. |
| Researcher Affiliation | Collaboration | 1 CSEM SA, Alpnach; 2 ETH Zürich; 3 Meta AI; 4 Sorbonne Université, CNRS, ISIR. |
| Pseudocode | No | The paper describes the methods and data generation steps in prose but does not include any explicit pseudocode blocks or clearly labeled algorithm figures. |
| Open Source Code | Yes | Code is available at https://github.com/SymposiumOrganization/ControllableNeuralSymbolicRegression. |
| Open Datasets | Yes | AIF: it comprises all the equations with up to 5 independent variables extracted from the publicly available AIFeynman database (Udrescu and Tegmark, 2020). ... For the ODE-Strogatz dataset we followed the approach from (La Cava et al., 2021) and used 75% of the points from the function call fetch_data from the PMLB repository for training (Olson et al., 2017) and the remaining for testing. |
| Dataset Splits | No | The paper mentions using training points to fit constants and select the best expressions and, for Section 3.4, splitting evaluation points (60% for fitting, 40% for selection). However, it does not explicitly describe a separate validation split (e.g., percentages or counts) used during model training or hyperparameter tuning. |
| Hardware Specification | Yes | We trained the model with 200 million equations using three NVIDIA GeForce RTX 3090 for a total of five days with a batch size of 400. |
| Software Dependencies | No | The paper mentions software such as SymPy and the Adam optimizer but does not specify version numbers or otherwise pin its software dependencies. |
| Experiment Setup | Yes | We trained the model with 200 million equations using three NVIDIA GeForce RTX 3090 for a total of five days with a batch size of 400. |
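The split protocol quoted above (60% of evaluation points to fit constants, the remaining 40% to select the best expression) can be sketched as below. This is an illustrative reconstruction, not the authors' code; the function name `split_points` and the seeding scheme are assumptions.

```python
import numpy as np

def split_points(X, y, fit_frac=0.6, seed=0):
    """Randomly partition evaluation points: a fit_frac share for
    fitting constants, the rest for selecting the best expression.
    Illustrative sketch only, not taken from the paper's codebase."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_fit = int(round(fit_frac * len(X)))
    fit_idx, sel_idx = idx[:n_fit], idx[n_fit:]
    return (X[fit_idx], y[fit_idx]), (X[sel_idx], y[sel_idx])

# Example: 100 evaluation points -> 60 for fitting, 40 for selection
X = np.linspace(0, 1, 100).reshape(-1, 1)
y = np.sin(X).ravel()
(X_fit, y_fit), (X_sel, y_sel) = split_points(X, y)
```

The same helper with `fit_frac=0.75` would mirror the 75%/25% train/test split the paper reports for the ODE-Strogatz points fetched from PMLB.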