Hamiltonian Neural Networks

Authors: Samuel Greydanus, Misko Dzamba, Jason Yosinski

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our models on problems where conservation of energy is important, including the two-body problem and pixel observations of a pendulum. Our model trains faster and generalizes better than a regular neural network. We found that HNNs train as quickly as baseline models and converge to similar final losses. Table 1 shows their relative performance over the three tasks.
Researcher Affiliation | Industry | Sam Greydanus (Google Brain, sgrey@google.com); Misko Dzamba (Pet Cube, mouse9911@gmail.com); Jason Yosinski (Uber AI Labs, yosinski@uber.com)
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | We make our code available at github.com/greydanus/hamiltonian-nn.
Open Datasets | Yes | We constructed training and test sets of 25 trajectories each and added Gaussian noise with standard deviation σ² = 0.1 to every data point. We used fourth-order Runge-Kutta integration to find 200 trajectories of 50 observations each and then performed an 80/20% train/test set split over trajectories. (See the data-generation sketch after the table.)
Dataset Splits | No | The paper explicitly mentions 'training and test sets' and 'train/test set split' but does not specify a separate validation set for hyperparameter tuning or early stopping during training.
Hardware Specification | No | The paper does not specify any hardware details (e.g., specific GPU models, CPU types, or cloud computing instances) used for running the experiments.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and 'scipy.integrate.solve_ivp' but does not provide specific version numbers for these or any other software dependencies, which would be necessary for full reproducibility.
Experiment Setup | Yes | In all three tasks, we trained our models with a learning rate of 10⁻³ and used the Adam optimizer [20]. Since the training sets were small, we set the batch size to be the total number of examples. ...All of our models had three layers, 200 hidden units, and tanh activations. We trained them for 2000 gradient steps... this time we trained for 10,000 gradient steps and used a batch size of 200. ...integrator in scipy.integrate.solve_ivp and set the error tolerance to 10⁻⁹. We found that using a small amount of weight decay, 10⁻⁵ in this case, was beneficial. (See the training-configuration sketch after the table.)
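
The data-generation recipe quoted in the Open Datasets row can be made concrete with a short script. The sketch below is a hedged illustration, not the authors' code: the ideal mass-spring Hamiltonian H(q, p) = ½(q² + p²), the initial-condition range, and the time window are assumptions, and scipy's RK45 stands in for the fourth-order Runge-Kutta integration mentioned in the paper; the trajectory count, observations per trajectory, noise scale, and 80/20 split over trajectories follow the quoted text.

```python
# Hedged sketch of the quoted data-generation recipe (not the authors' code).
# Assumptions: ideal mass-spring Hamiltonian H(q, p) = 0.5*(q^2 + p^2), random
# initial conditions in [-1, 1], a 5-second window, and scipy's RK45 standing in
# for the fourth-order Runge-Kutta integration mentioned in the paper.
import numpy as np
from scipy.integrate import solve_ivp

def dynamics(t, state):
    """Hamilton's equations for H = 0.5*(q^2 + p^2): dq/dt = p, dp/dt = -q."""
    q, p = state
    return [p, -q]

def make_dataset(n_traj=200, n_obs=50, t_span=(0.0, 5.0), noise_std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    t_eval = np.linspace(*t_span, n_obs)
    trajectories = []
    for _ in range(n_traj):
        y0 = rng.uniform(-1.0, 1.0, size=2)                  # random (q, p) start
        sol = solve_ivp(dynamics, t_span, y0, t_eval=t_eval, rtol=1e-9)
        traj = sol.y.T                                        # (n_obs, 2) clean states
        traj += noise_std * rng.standard_normal(traj.shape)   # Gaussian observation noise
        trajectories.append(traj)
    data = np.stack(trajectories)                             # (n_traj, n_obs, 2)
    split = int(0.8 * n_traj)                                 # 80/20 split over trajectories
    return data[:split], data[split:]

train, test = make_dataset()
print(train.shape, test.shape)  # (160, 50, 2) (40, 50, 2)
```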
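
The hyperparameters quoted in the Experiment Setup row can likewise be collected into a minimal training sketch. PyTorch is assumed here, and the toy coords/dcoords tensors, the 2-D state, and the scalar-H output head are illustrative simplifications; the released code at github.com/greydanus/hamiltonian-nn remains the reference. The three tanh layers of 200 units, Adam with learning rate 10⁻³, full-batch training for 2000 gradient steps, and the 10⁻⁵ weight decay (quoted for the pixel task) follow the quoted text.

```python
# Hedged sketch of the quoted training configuration (PyTorch assumed; the released
# code at github.com/greydanus/hamiltonian-nn is the reference implementation).
import torch
import torch.nn as nn

def make_mlp(input_dim=2, hidden=200, output_dim=1):
    # "three layers, 200 hidden units, and tanh activations" (quoted above)
    return nn.Sequential(
        nn.Linear(input_dim, hidden), nn.Tanh(),
        nn.Linear(hidden, hidden), nn.Tanh(),
        nn.Linear(hidden, output_dim),
    )

def hnn_time_derivative(model, coords):
    """Differentiate the learned scalar H to get (dq/dt, dp/dt) = (dH/dp, -dH/dq)."""
    coords = coords.requires_grad_(True)
    H = model(coords).sum()
    dH = torch.autograd.grad(H, coords, create_graph=True)[0]
    return torch.cat([dH[:, 1:2], -dH[:, 0:1]], dim=1)

# Toy stand-in data; in practice these are states and their true time derivatives
# taken from the generated trajectories (here: dq/dt = p, dp/dt = -q).
coords = torch.randn(1000, 2)
dcoords = torch.cat([coords[:, 1:2], -coords[:, 0:1]], dim=1)

model = make_mlp()
# Adam, learning rate 1e-3, small weight decay (1e-5, quoted for the pixel task)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
for step in range(2000):                      # 2000 gradient steps, full-batch
    optimizer.zero_grad()
    loss = ((hnn_time_derivative(model, coords) - dcoords) ** 2).mean()
    loss.backward()
    optimizer.step()
```

Parameterizing the vector field as the symplectic gradient of a single learned scalar is what encourages energy conservation; the paper's baseline instead regresses the time derivatives directly with an otherwise identical MLP.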