Symbolic Regression via Deep Reinforcement Learning Enhanced Genetic Programming Seeding
Authors: T. Nathan Mundhenk, Mikel Landajuela, Ruben Glatt, Claudio P. Santiago, Daniel M. Faissol, Brenden K. Petersen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a number of common benchmark tasks to recover underlying expressions from a dataset, our method recovers 65% more expressions than a recently published top-performing model using the same experimental setup. |
| Researcher Affiliation | Academia | T. Nathan Mundhenk mundhenk1@llnl.gov Mikel Landajuela landajuelala1@llnl.gov Ruben Glatt glatt1@llnl.gov Claudio P. Santiago prata@llnl.gov Daniel M. Faissol faissol1@llnl.gov Brenden K. Petersen bp@llnl.gov Computational Engineering Division Lawrence Livermore National Laboratory Livermore, CA 94550 |
| Pseudocode | Yes | Algorithm 1 Neural-guided genetic programming population seeding |
| Open Source Code | Yes | Source code is provided at www.github.com/brendenpetersen/deep-symbolic-optimization. |
| Open Datasets | Yes | We used two popular benchmark problem sets to compare our technique to other methods: Nguyen [Uy et al., 2014] and the R rationals [Krawiec and Pawlak, 2013]. Additionally, we introduce a new benchmark problem set with this work, which we call Livermore. |
| Dataset Splits | No | The paper does not explicitly provide specific training/validation/test dataset splits (exact percentages, sample counts, or direct references to predefined splits) needed for reproduction in its main text. |
| Hardware Specification | Yes | Experiments were conducted on 36 core, 2.1 GHz, Intel Xeon E5-2695 workstations. |
| Software Dependencies | No | The paper mentions using 'DEAP' for the genetic programming component, but does not provide specific version numbers for DEAP or any other ancillary software components used in the experiments. |
| Experiment Setup | Yes | For all algorithms, we tuned hyperparameters using Nguyen-7 and R-3. Hyperparameters are shown in Appendix Table 11. An important hyperparameter in our method is S, the number of GP generations to perform per RNN training step. Figure 2 shows a post-hoc analysis of how performance varies with the number of GP steps between each RNN training step. The optimal number of steps is between 10 and 25. |
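The table above references Algorithm 1 (neural-guided GP population seeding) and the hyperparameter S. The sketch below illustrates the overall control flow implied by those descriptions: sample a seed population from an RNN policy, run S GP generations on it, then train the RNN on the elite expressions. All function names, the `(expression, fitness)` tuple convention, and the toy stand-ins are assumptions for illustration; the paper's actual implementation uses DEAP and a risk-seeking policy gradient.

```python
import random


def neural_guided_gp_seeding(sample_from_rnn, gp_generation, train_rnn,
                             n_iterations=3, gp_steps_per_rnn_step=10,
                             pop_size=8):
    """Toy sketch of neural-guided GP population seeding.

    Each outer iteration: (1) sample a population of candidate
    expressions from the RNN policy, (2) run S GP generations seeded
    with that population, (3) train the RNN on the best expression
    found by GP. All callables are hypothetical stand-ins, not the
    authors' API. Individuals are (expression, fitness) tuples.
    """
    best = None
    for _ in range(n_iterations):
        # Seed the GP population from the RNN policy.
        population = [sample_from_rnn() for _ in range(pop_size)]
        # Run S generations of genetic programming on the seeded population.
        for _ in range(gp_steps_per_rnn_step):
            population = gp_generation(population)
        # Use the elite GP result as a training signal for the RNN.
        elite = max(population, key=lambda ind: ind[1])
        train_rnn(elite)
        if best is None or elite[1] > best[1]:
            best = elite
    return best


# Deterministic toy stand-ins for demonstration only.
rng = random.Random(0)

def sample_from_rnn():
    # A real policy would emit a symbolic expression token by token.
    return ("expr", rng.random())

def gp_generation(population):
    # A real GP step applies selection, crossover, and mutation.
    return [(expr, fit + rng.random() * 0.1) for expr, fit in population]

def train_rnn(elite):
    # A real update would apply a policy-gradient step; no-op here.
    pass

best = neural_guided_gp_seeding(sample_from_rnn, gp_generation, train_rnn)
```

The key design point the paper's Figure 2 probes is the ratio of GP steps to RNN steps (`gp_steps_per_rnn_step` here); the reported sweet spot of 10 to 25 GP generations per RNN update lets GP refine the seeded population substantially before the policy is updated.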