Deep Generative Symbolic Regression
Authors: Samuel Holt, Zhaozhi Qian, Mihaela van der Schaar
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we show that DGSR achieves a higher recovery rate of true equations in the setting of a larger number of input variables, and it is more computationally efficient at inference time than state-of-the-art RL symbolic regression solutions. (Section 5, Experiments and Evaluation) |
| Researcher Affiliation | Academia | Samuel Holt University of Cambridge sih31@cam.ac.uk Zhaozhi Qian University of Cambridge zq224@maths.cam.ac.uk Mihaela van der Schaar University of Cambridge The Alan Turing Institute mv472@cam.ac.uk |
| Pseudocode | Yes | Furthermore, we provide pseudocode for DGSR in Appendix D and show empirically that other optimization algorithms can be used, with an ablation of these in Section 5.2 and Appendix E. |
| Open Source Code | Yes | Additionally, the code is available at https://github.com/samholt/DeepGenerativeSymbolicRegression, and a broader research group codebase is available at https://github.com/vanderschaarlab/DeepGenerativeSymbolicRegression |
| Open Datasets | Yes | We evaluate DGSR on a set of common equations in natural sciences from the standard SR benchmark problem sets and on a problem set with a large number of input variables (d = 12).... We use equations from the Feynman SR database (Udrescu & Tegmark, 2020)... We also benchmark on SRBench (La Cava et al., 2021)... |
| Dataset Splits | Yes | Additionally, we construct a validation set of 100 equations using the same pre-training setup with a different random seed, and check and remove any of the validation equations from the pre-training set. (A minimal sketch of this split construction appears after the table.) |
| Hardware Specification | Yes | This work was performed using an Intel Core i9-12900K CPU @ 3.20GHz, 64GB RAM, and an Nvidia RTX 3090 GPU with 24GB of memory. |
| Software Dependencies | No | The paper mentions software such as PyTorch, the Adam optimizer, DEAP, and SymPy, but does not provide specific version numbers for these components. |
| Experiment Setup | Yes | During pre-training we use the vanilla policy gradient (VPG) loss function to train the conditional generator parameters θ. This is detailed in Appendices D and C, and we use the hyperparameters: batch size of k = 500 equations to sample, mini-batch of t = 5 datasets, EWMA coefficient α = 0.5, entropy weight λH = 0.003, minimum equation length = 4, maximum equation length = 30, Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001, and an early stopping patience of 100 iterations (of a mini-batch). The hyperparameters for inference time are: batch size of k = 500 equations to sample, entropy weight λH = 0.003, minimum equation length = 4, maximum equation length = 30, PQT queue size = 10, sample selection size = 1, GP generations per iteration = 25, GP crossover probability = 0.5, GP mutation probability = 0.5, GP tournament size = 5, GP mutate tree maximum = 3, and Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001. We also used ϵ = 0.02 for the risk-seeking quantile parameter. (These hyperparameters are gathered into a configuration sketch after the table.) |
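The validation-split procedure quoted in the Dataset Splits row can be summarised in a short sketch. The names `sample_equations` and `pretrain_eqs` below are hypothetical placeholders for the paper's equation sampler and pre-training corpus; only the seed change and overlap removal described in the quote are illustrated, not the authors' implementation.

```python
def build_validation_split(sample_equations, pretrain_eqs, n_val=100, val_seed=1):
    """Construct a held-out validation set of equations.

    `sample_equations` and `pretrain_eqs` are hypothetical stand-ins for
    the paper's pre-training equation sampler and its sampled corpus;
    only the overlap-removal logic from the quote is shown.
    """
    # Validation equations: same sampling setup, different random seed.
    val_eqs = list(sample_equations(n_val, seed=val_seed))

    # "Check and remove any of the validation equations from the
    # pre-training set": drop overlaps from the pre-training corpus.
    val_set = set(val_eqs)
    pretrain_eqs = [eq for eq in pretrain_eqs if eq not in val_set]
    return pretrain_eqs, val_eqs
```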
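For readability, the hyperparameters quoted in the Experiment Setup row can also be gathered into a single configuration, and the risk-seeking quantile ϵ = 0.02 can be illustrated as a filter over sampled rewards. This is a minimal sketch under the assumption that risk-seeking training keeps only the top-ϵ fraction of a batch, as in risk-seeking policy gradient methods; the dictionary keys and `risk_seeking_filter` are illustrative names, not the authors' API.

```python
import numpy as np

# Pre-training hyperparameters as quoted above, collected in one place.
PRETRAIN_CONFIG = dict(
    batch_size=500,               # k: equations sampled per batch
    mini_batch_datasets=5,        # t: datasets per mini-batch
    ewma_alpha=0.5,               # EWMA baseline coefficient
    entropy_weight=3e-3,          # lambda_H
    min_eq_length=4,
    max_eq_length=30,
    optimizer="Adam",
    learning_rate=1e-3,
    early_stopping_patience=100,  # mini-batch iterations
)

# Inference-time hyperparameters as quoted above.
INFERENCE_CONFIG = dict(
    batch_size=500,
    entropy_weight=3e-3,
    min_eq_length=4,
    max_eq_length=30,
    pqt_queue_size=10,
    sample_selection_size=1,
    gp_generations_per_iter=25,
    gp_crossover_prob=0.5,
    gp_mutation_prob=0.5,
    gp_tournament_size=5,
    gp_mutate_tree_max=3,
    optimizer="Adam",
    learning_rate=1e-3,
    risk_seeking_epsilon=0.02,
)

def risk_seeking_filter(rewards, epsilon=0.02):
    """Keep only the top-epsilon fraction of sampled equations.

    Illustrative only: rewards at or above the (1 - epsilon)-quantile
    are retained for the gradient update.
    """
    rewards = np.asarray(rewards)
    threshold = np.quantile(rewards, 1.0 - epsilon)
    keep = rewards >= threshold
    return keep, threshold
```

With a batch of k = 500 sampled equations and ϵ = 0.02, such a filter retains roughly the 10 highest-reward samples per batch for the update.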