Neural Symbolic Regression that Scales
Authors: Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurelien Lucchi, Giambattista Parascandolo
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show empirically that this approach can re-discover a set of well-known physical equations, and that it improves over time with more data and compute. |
| Researcher Affiliation | Collaboration | ¹Department of Computer Science, ETH Zürich, Switzerland; ²CSEM SA, Alpnach, Switzerland; ³Max Planck Institute for Intelligent Systems, Tübingen, Germany. |
| Pseudocode | Yes | Algorithm 1 Neural Symbolic Regression pre-training |
| Open Source Code | Yes | We release our code and largest pre-trained model. https://github.com/SymposiumOrganization/NeuralSymbolicRegressionThatScales |
| Open Datasets | Yes | AI-Feynman (AIF): First, we consider all the equations with up to 3 independent variables from the AI-Feynman (AIF) database (Udrescu & Tegmark, 2020). The resulting dataset consists of 52 equations extracted from the popular Feynman Lectures on Physics series. https://space.mit.edu/home/tegmark/aifeynman.html (a filtering sketch follows this table) |
| Dataset Splits | No | We train all models for the same number of iterations, but use early stopping on a held-out validation set to prevent overfitting. |
| Hardware Specification | No | The paper states: 'To make the comparison as fair as possible, we decided to run every method on a single CPU at the time.' This mentions a 'single CPU' but does not provide specific model numbers or detailed specifications of the CPU or any other hardware. |
| Software Dependencies | No | The paper mentions 'the default PyTorch implementation', 'symbolic manipulation library SymPy (Meurer et al., 2017)', 'Adam', 'BFGS', 'gplearn', and 'sklearn implementation'. However, it does not specify version numbers for PyTorch, SymPy, or other libraries/packages directly used for their method. |
| Experiment Setup | Yes | Encoder and decoder have 11 and 13 million parameters respectively. We sample mini-batches of size B = 150. We train the encoder and decoder jointly to minimize the cross-entropy loss between the ground-truth skeleton and the skeleton predicted by the decoder, as a regular language model. We use Adam with a learning rate of 10^-4, no schedules, and train for 1.5M steps. Our default parameters at test time are beam size 32, with 4 restarts of BFGS per equation. (Training and constant-fitting sketches follow this table.) |
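
The Open Datasets row refers to the 52 AI-Feynman equations with up to 3 independent variables. Below is a minimal pandas sketch of that selection step; the file name `FeynmanEquations.csv` and the `# variables` column header are assumptions about how the benchmark distributes its equation index, not details stated in the paper.

```python
import pandas as pd

# Assumed layout of the AI-Feynman benchmark's equation index: a CSV with one
# row per equation and a "# variables" column (header name is an assumption).
equations = pd.read_csv("FeynmanEquations.csv")

# Keep equations with at most 3 independent variables, matching the
# 52-equation AIF subset used in the paper.
aif_subset = equations[equations["# variables"] <= 3]
print(len(aif_subset), "equations with up to 3 independent variables")
```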
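
The Experiment Setup row describes a standard sequence-to-sequence pre-training recipe: mini-batches of 150 equations, cross-entropy on skeleton tokens as in a regular language model, and Adam at a learning rate of 10^-4 with no schedule. The sketch below illustrates one such training step in PyTorch; the toy `SkeletonSeq2Seq` model, vocabulary size, and point dimensionality are placeholders, not the authors' 11M/13M-parameter architecture (which is available in their repository).

```python
import torch
import torch.nn as nn

# Toy stand-ins for the paper's model: the real set encoder (~11M parameters)
# and skeleton decoder (~13M parameters) are in the authors' repository.
VOCAB_SIZE = 64   # assumed size of the skeleton token vocabulary
PAD_ID = 0        # assumed padding token id

class SkeletonSeq2Seq(nn.Module):
    def __init__(self, d_model=256, nhead=8, num_layers=3):
        super().__init__()
        # Each support point is (x1, x2, x3, y): up to 3 independent variables.
        self.point_proj = nn.Linear(4, d_model)
        self.tok_emb = nn.Embedding(VOCAB_SIZE, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.out = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, points, tokens):
        # points: (B, n_points, 4); tokens: (B, seq_len) skeleton token ids.
        src = self.point_proj(points)
        tgt = self.tok_emb(tokens)
        causal = self.transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.transformer(src, tgt, tgt_mask=causal)
        return self.out(h)   # (B, seq_len, VOCAB_SIZE) logits

model = SkeletonSeq2Seq()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # lr and no schedule, as in the paper
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD_ID)

def training_step(points, tokens):
    # Teacher forcing: predict token t from tokens < t, as in a regular language model.
    logits = model(points, tokens[:, :-1])
    loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# One step on a random toy batch: B = 150 equations, 100 support points each.
points = torch.randn(150, 100, 4)
tokens = torch.randint(1, VOCAB_SIZE, (150, 30))
print(training_step(points, tokens))
```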
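
At test time the decoder proposes equation skeletons via beam search (beam size 32), and the numerical constants of each candidate are then fitted with BFGS, with 4 restarts per equation. Below is a hedged sketch of that constant-fitting stage using SciPy's BFGS; `fit_constants` and the sine skeleton in the usage example are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def fit_constants(skeleton_fn, n_constants, X, y, n_restarts=4, seed=0):
    """Fit the placeholder constants of one candidate skeleton with BFGS.

    skeleton_fn(consts, X) -> predictions. In the actual pipeline this callable
    would be built from the decoded skeleton; here it is passed in directly.
    """
    rng = np.random.default_rng(seed)
    best_consts, best_mse = None, np.inf
    for _ in range(n_restarts):              # 4 restarts per equation, as in the paper
        c0 = rng.normal(size=n_constants)    # random initialization (assumed scheme)
        mse = lambda c: float(np.mean((skeleton_fn(c, X) - y) ** 2))
        result = minimize(mse, c0, method="BFGS")
        if result.fun < best_mse:
            best_consts, best_mse = result.x, result.fun
    return best_consts, best_mse

# Hypothetical skeleton c0 * sin(c1 * x) recovered from noiseless observations.
X = np.linspace(-3, 3, 100)
y = 2.0 * np.sin(0.5 * X)
consts, mse = fit_constants(lambda c, X: c[0] * np.sin(c[1] * X), n_constants=2, X=X, y=y)
print(consts, mse)
```

Across the beam of decoded skeletons, the candidate with the lowest fitting error would then be returned as the final predicted equation.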