Symbolic Physics Learner: Discovering governing equations via Monte Carlo tree search
Authors: Fangzheng Sun, Yang Liu, Jian-Xun Wang, Hao Sun
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The efficacy and superiority of the SPL machine are demonstrated by numerical examples, compared with state-of-the-art baselines. The effect of data noise/scarcity on the recovery rate is also examined: heatmaps show the recovery rate of GP and the SPL machine under different data conditions, summarized over 100 independent trials. Ablation study: four ablations are considered, removing (a) the adaptive scaling in reward calculation, (b) the discount factor η^n that drives equation parsimony in Eq. (2), (c) module transplantation in tree generation, and (d) all of the above. |
| Researcher Affiliation | Academia | Fangzheng Sun (1), Yang Liu (2), Jian-Xun Wang (3), Hao Sun (4); (1) Northeastern University, Boston, MA, USA; (2) University of Chinese Academy of Sciences, Beijing, China; (3) University of Notre Dame, Notre Dame, IN, USA; (4) Renmin University of China, Beijing, China. |
| Pseudocode | Yes | Algorithm 1: Training SPL for discovering the ith governing equation (i = 1, 2, ..., m) |
| Open Source Code | No | The paper mentions the source code for a baseline model (NGGP) in a footnote: 'NGGP source code: https://github.com/brendenpetersen/deep-symbolic-optimization/tree/master/dso'. However, it does not explicitly state that the source code for their own proposed SPL methodology is being released or provide a link to it. |
| Open Datasets | Yes | This section provides data-driven discovery of the physical laws of relationships between height and time in the cases of free-falling objects with air resistance based on multiple experimental ball-drop datasets (de Silva et al., 2020), which contain the records of 11 different types of balls dropped from a bridge (see Appendix Figure D.1). The second nonlinear dynamics discovery experiment is a chaotic double pendulum system (Asseman et al., 2018). |
| Dataset Splits | Yes | For discovery, each dataset is split into a training set (records from the first 2 seconds) and a testing set (records after 2 seconds). For this discovery, 5,000 denoised random sub-samples from 5 datasets are used for training, 2,000 random sub-samples from another 2 datasets for validation, and 1 dataset for testing. |
| Hardware Specification | Yes | All simulations are performed on a standard workstation with an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions using the `gplearn` Python package for the GP-based symbolic regressor and MATLAB's `ode113` function for generating synthetic data, but it does not specify version numbers for these or any other key software dependencies, such as the programming language or deep learning framework. |
| Experiment Setup | Yes | The hyperparameters of the SPL machine are set as η = 0.99, tmax = 50, with 10,000 episodes of training regarded as one trial. In another experiment, the hyperparameters are set as η = 0.9999, tmax = 20, and a single discovery is built upon 6,000 episodes of training. In a third, the hyperparameters are set as η = 1, tmax = 20, with 40,000 episodes of training regarded as one trial. |
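The ablation entries above mention the discount factor η^n in the paper's Eq. (2), which drives equation parsimony by penalizing large expression trees. As a rough illustration of how such a term shapes the search reward, here is a minimal sketch; the NRMSE-based form and the names `parsimony_reward` and `n_nodes` are my assumptions, not the authors' code, and the exact reward definition is the one in the paper.

```python
import numpy as np

def parsimony_reward(y_true, y_pred, n_nodes, eta=0.9999):
    """Parsimony-discounted fit reward (sketch in the spirit of SPL's Eq. (2)).

    eta ** n_nodes shrinks the reward as the expression tree grows,
    so two equations with equal fitting error rank by size: the
    smaller (more parsimonious) one wins. Fit quality is measured
    here by a normalized RMSE (an assumed, common choice).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Normalized root-mean-square error of the candidate equation.
    nrmse = np.sqrt(np.mean((y_true - y_pred) ** 2)) / (np.std(y_true) + 1e-12)
    # Discounted reward in (0, 1]; perfect fit with eta = 1 gives 1.0.
    return (eta ** n_nodes) / (1.0 + nrmse)
```

With η = 1 (as in one of the reported settings) the size penalty vanishes and the reward depends on fit alone; with η < 1, a perfectly fitting but larger tree scores strictly below a perfectly fitting smaller one.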