Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients
Authors: Brenden K. Petersen, Mikel Landajuela Larma, Terrell N. Mundhenk, Claudio Prata Santiago, Soo Kyung Kim, Joanne Taery Kim
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our algorithm outperforms several baseline methods (including Eureqa, the gold standard for symbolic regression) in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. We demonstrate that DSR outperforms several baseline methods, including two commercial software algorithms. In Table 1, we report the recovery rate for each benchmark. |
| Researcher Affiliation | Academia | Brenden K. Petersen (bp@llnl.gov), Mikel Landajuela Larma (landajuelala1@llnl.gov), T. Nathan Mundhenk (mundhenk1@llnl.gov), Claudio P. Santiago (santiago10@llnl.gov), Soo K. Kim (kim79@llnl.gov), and Joanne T. Kim (kim102@llnl.gov); all with Lawrence Livermore National Laboratory, Livermore, CA, USA. |
| Pseudocode | Yes | Pseudocode for DSR is shown in Algorithm 1. Source code is made available at https://github.com/brendenpetersen/deep-symbolic-regression. Pseudocode for sampling an expression from the RNN: the sampling process in DSR (line 4 of Algorithm 1) is more complicated than typical autoregressive sampling procedures due to applying constraints in situ and providing hierarchical information to the RNN. Thus, we provide pseudocode for this process in Algorithm 2. Within this algorithm, the function Arity(τᵢ) simply returns the arity (number of arguments) of token τᵢ, i.e. two for binary operators, one for unary operators, or zero for input variables or constants. (A minimal sketch of this arity-driven sampling loop appears below the table.) |
| Open Source Code | Yes | Pseudocode for DSR is shown in Algorithm 1. Source code is made available at https://github.com/brendenpetersen/deep-symbolic-regression. |
| Open Datasets | Yes | We evaluated DSR on the Nguyen symbolic regression benchmark suite (Uy et al., 2011), a set of 12 commonly used benchmark expressions developed and vetted by the symbolic regression community (White et al., 2013). The training data is used to compute the reward for each candidate expression, the test data is used to evaluate the best found candidate expression at the end of training, and the ground truth expression is used to determine whether the best found candidate expression was correctly recovered. U(a, b, c) denotes c random points uniformly sampled between a and b for each input variable; training and test datasets use different random seeds. (A minimal data-generation sketch following this convention appears below the table.) |
| Dataset Splits | No | The paper explicitly describes training and test datasets but does not describe a distinct validation split for the overall experiment setup. |
| Hardware Specification | Yes | Experiments were executed on an Intel Xeon E5-2695 v4 equipped with NVIDIA Tesla P100 GPUs, with 32 cores per node, 2 GPUs per node, and 256 GB RAM per node. |
| Software Dependencies | No | The paper mentions software packages and platforms like 'deap', 'SymPy', 'Data Robot platform', and 'Mathematica', but does not provide specific version numbers for any of them or for other potential software dependencies like deep learning frameworks. |
| Experiment Setup | Yes | Hyperparameters were tuned by performing grid search on benchmarks Nguyen-7 and Nguyen-10. For each hyperparameter combination, we performed 10 independent training runs of the algorithm for 1M total expression evaluations. We selected the hyperparameter combination with the highest average recovery rate, with ties broken by lowest average NRMSE. For all algorithms, the best found hyperparameters were used for all experiments and all benchmark expressions. ... For these three algorithms, the space of hyperparameters considered was batch size {250, 500, 1000}, learning rate {0.0003, 0.0005, 0.001}, and entropy weight λH {0.01, 0.05, 0.1}. ... The final tuned hyperparameters are listed in Table 3. (Note that hyperparameters were tuned independently for each algorithm; identical values across algorithms are incidental.) For GP, the space of hyperparameters considered was population size {100, 250, 500, 1000}, tournament size {2, 3, 5, 10}, mutation probability {0.01, 0.03, 0.05, 0.10, 0.15}, crossover probability {0.25, 0.50, 0.75, 0.90, 0.95}, and post hoc constraints {TRUE, FALSE} (800 combinations). The final tuned hyperparameters are listed in Table 4. (A sketch of this selection rule appears below the table.) |
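
To make the arity-driven sampling loop described in the Pseudocode row concrete, here is a minimal Python sketch. It is not the authors' implementation: the RNN policy (which DSR conditions on the parent and sibling of the slot being filled, with invalid tokens masked in situ) is replaced by a uniform random choice, the token library is a made-up example, and only a simple length constraint is applied.

```python
import random

# Hypothetical token library: name -> arity (two for binary operators,
# one for unary operators, zero for input variables or constants).
LIBRARY = {"add": 2, "mul": 2, "sin": 1, "cos": 1, "x1": 0, "const": 0}

def sample_expression(max_length=30):
    """Sample a pre-order (prefix) token traversal of an expression tree.

    A counter tracks how many tree slots remain unfilled; sampling stops
    once every operator has received all of its arguments. In DSR proper,
    the next-token distribution comes from the RNN; here we draw uniformly
    for illustration.
    """
    tokens, open_slots = [], 1
    while open_slots > 0:
        # Mask out tokens whose arity would make it impossible to finish
        # the tree within max_length (the in-situ constraint idea).
        candidates = [t for t, a in LIBRARY.items()
                      if len(tokens) + open_slots + a <= max_length]
        token = random.choice(candidates)
        tokens.append(token)
        # Each token fills one open slot and opens Arity(token) new ones.
        open_slots += LIBRARY[token] - 1
    return tokens

print(sample_expression())  # e.g. ['mul', 'sin', 'x1', 'const']
```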
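
The U(a, b, c) convention in the Open Datasets row maps directly to code. The sketch below, with function names of our choosing, generates training and test sets from different seeds for Nguyen-1, which the benchmark suite specifies as f(x) = x^3 + x^2 + x sampled as U(-1, 1, 20).

```python
import numpy as np

def make_dataset(expr, low, high, n_points, seed):
    """U(a, b, c): c points sampled uniformly on [a, b] per input variable.

    Training and test datasets use different random seeds, matching the
    paper's description. `expr` is a vectorized ground-truth function.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(low, high, size=(n_points, 1))
    y = expr(X[:, 0])
    return X, y

# Nguyen-1: f(x) = x^3 + x^2 + x, sampled as U(-1, 1, 20).
nguyen1 = lambda x: x**3 + x**2 + x
X_train, y_train = make_dataset(nguyen1, -1, 1, 20, seed=0)
X_test,  y_test  = make_dataset(nguyen1, -1, 1, 20, seed=1)
```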
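
Finally, the selection rule in the Experiment Setup row (highest average recovery rate, ties broken by lowest average NRMSE) over the stated 3×3×3 grid can be sketched as follows. The `results` mapping is hypothetical; in the paper each entry would be averaged over 10 independent runs of 1M expression evaluations on Nguyen-7 and Nguyen-10.

```python
import itertools

# The grid from the paper for the RNN-based algorithms: 27 combinations
# of batch size, learning rate, and entropy weight lambda_H.
GRID = list(itertools.product(
    [250, 500, 1000],         # batch size
    [0.0003, 0.0005, 0.001],  # learning rate
    [0.01, 0.05, 0.1],        # entropy weight
))

def select_best(results):
    """results: {(batch, lr, entropy): (mean_recovery_rate, mean_nrmse)}.

    Highest average recovery rate wins; ties are broken by the lowest
    average NRMSE (hence the negated second sort key).
    """
    return max(GRID, key=lambda hp: (results[hp][0], -results[hp][1]))
```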