Discovering Symbolic Models from Deep Learning with Inductive Biases

Authors: Miles Cranmer, Alvaro Sanchez Gonzalez, Peter Battaglia, Rui Xu, Kyle Cranmer, David Spergel, Shirley Ho

NeurIPS 2020

Reproducibility Variable Result LLM Response
Research Type | Experimental | "In this section we present three specific case studies where we apply our proposed framework using additional inductive biases. We train our Newtonian dynamics GNs on data for simple N-body systems with known force laws. We then apply our technique to recover the known force laws via the representations learned by the message function φe. Data. The dataset consists of N-body particle simulations in two and three dimensions, under different interaction laws. To evaluate the learned models, we generate a new dataset from a different random seed. We find that the model with L1 regularization has the greatest prediction performance in most cases (see table 3)."
Researcher Affiliation | Collaboration | Miles Cranmer (1), Alvaro Sanchez-Gonzalez (2), Peter Battaglia (2), Rui Xu (1), Kyle Cranmer (3), David Spergel (4,1), Shirley Ho (4,3,1,5). 1: Princeton University, Princeton, USA; 2: DeepMind, London, UK; 3: New York University, New York City, USA; 4: Flatiron Institute, New York City, USA; 5: Carnegie Mellon University, Pittsburgh, USA.
Pseudocode | No | The paper describes the framework steps but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Code for our models and experiments can be found at https://github.com/MilesCranmer/symbolic_deep_learning."
Open Datasets | Yes | "The dataset consists of N-body particle simulations in two and three dimensions, under different interaction laws." "We study this problem with the open sourced N-body dark matter simulations from [49]."
Dataset Splits | No | The paper mentions training data and out-of-distribution data, but does not explicitly specify a validation set or clear train/validation/test splits for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or processing units used for the experiments.
Software Dependencies | No | The paper lists the software packages used (e.g., PyTorch, TensorFlow, Jax, numpy, scipy, sklearn, jupyter, matplotlib, pandas, torch_geometric) but does not provide version numbers for these dependencies.
Experiment Setup | Yes | "We train them with a decaying learning schedule using Adam [43]. To investigate the importance of the size of the message representations for interpreting the messages as forces, we train our GN using 4 different strategies: 1. Standard, a GN with 100 message components; 2. Bottleneck, a GN with the number of message components matching the dimensionality of the problem (2 or 3); 3. L1, same as Standard but using an L1 regularization loss term on the messages with a weight of 10⁻²; and 4. KL, same as Standard but regularizing the messages using the Kullback-Leibler (KL) divergence with respect to a Gaussian prior. Training details are the same as for the Newtonian simulations, but we switch to 500 hidden units after hyperparameter tuning based on GN accuracy."
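The "L1" strategy quoted in the Experiment Setup row can be illustrated with a short sketch. The following is not the authors' code: layer widths, the message-passing layout, and all variable names are illustrative assumptions; only the message width (100), the problem dimensionality, and the 10⁻² L1 weight come from the quoted setup. It shows a graph-network edge function φe producing 100-dimensional messages, summed at receiver nodes, with an L1 penalty on the messages added to the prediction loss so that only a few message components stay active.

```python
# Sketch of the paper's "L1" training strategy (illustrative, not the
# authors' implementation): penalize the GN's edge messages with an L1
# term so that interactions compress into a few active components.
import torch
import torch.nn as nn

DIM = 2          # 2-D N-body problem (paper also uses 3-D)
MSG = 100        # message width of the Standard/L1 strategies
L1_WEIGHT = 1e-2 # L1 regularization weight quoted in the setup

# Edge (message) function phi_e: maps a (sender, receiver) node pair to a message.
# Node state is assumed to be [position, velocity, mass] -> 2*DIM + 1 features.
edge_model = nn.Sequential(
    nn.Linear(2 * (2 * DIM + 1), 300), nn.ReLU(),
    nn.Linear(300, MSG),
)
# Node update phi_v: aggregated messages + node state -> predicted acceleration.
node_model = nn.Sequential(
    nn.Linear(MSG + 2 * DIM + 1, 300), nn.ReLU(),
    nn.Linear(300, DIM),
)

def forward(nodes, senders, receivers):
    # nodes: [N, 2*DIM+1]; senders/receivers: edge index pairs.
    msgs = edge_model(torch.cat([nodes[senders], nodes[receivers]], dim=-1))
    # Sum incoming messages at each receiver node.
    agg = torch.zeros(nodes.shape[0], MSG).index_add_(0, receivers, msgs)
    accel = node_model(torch.cat([agg, nodes], dim=-1))
    return accel, msgs

# One toy training step on random data, with the L1 message penalty.
torch.manual_seed(0)
nodes = torch.randn(4, 2 * DIM + 1)
senders = torch.tensor([0, 1, 2, 3])
receivers = torch.tensor([1, 0, 3, 2])
target = torch.randn(4, DIM)

opt = torch.optim.Adam(list(edge_model.parameters()) + list(node_model.parameters()))
accel, msgs = forward(nodes, senders, receivers)
loss = (accel - target).abs().mean() + L1_WEIGHT * msgs.abs().mean()
loss.backward()
opt.step()
```

The L1 term is what makes the messages interpretable downstream: with most components driven to zero, the surviving ones can be fit against candidate force-law expressions by symbolic regression, which is the interpretation step the quoted case study describes.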