Deconstructing the Inductive Biases of Hamiltonian Neural Networks

Authors: Nate Gruver, Marc Anton Finzi, Samuel Don Stanton, Andrew Gordon Wilson

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper we theoretically and empirically examine the role of these biases. We show that by relaxing the inductive biases of these models, we can match or exceed performance on energy-conserving systems while dramatically improving performance on practical, non-conservative systems. We extend this approach to constructing transition models for common Mujoco environments, showing that our model can appropriately balance inductive biases with the flexibility required for model-based control. [See the Hamiltonian-dynamics sketch below.]
Researcher Affiliation | Academia | Nate Gruver, Marc Finzi, Samuel Stanton, Andrew Gordon Wilson (New York University)
Pseudocode | No | The paper describes methods and algorithms in prose but does not include any structured pseudocode blocks or figures labeled 'Algorithm'.
Open Source Code | Yes | Code for our experiments can be found at: https://github.com/ngruver/decon-hnn.
Open Datasets | Yes | We train NODEs and HNNs on trajectories from several OpenAI Gym Mujoco environments (Brockman et al., 2016). We select synthetic environments from Finzi et al. (2020) and Finzi et al. (2021) that are derived from a time-independent Hamiltonian, where energy is preserved exactly. [See the data-collection sketch below.]
Dataset Splits | No | The paper specifies training and test data splits (e.g., 'The training data was 40K 3-step trajectories... The test data was 200 200-step trajectories...'), but it does not mention or describe a separate validation split for hyperparameter tuning or early stopping.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions the 'Adam' optimizer and the 'Euler integration rule' but does not specify version numbers for any software libraries or frameworks such as PyTorch, TensorFlow, or Python.
Experiment Setup | Yes | Training: we trained each model for 256 epochs using Adam with a batch size of 200 and weight decay (λ = 1e-4). We used a cosine annealing learning rate schedule, with η_max = 2e-4, η_min = 1e-6. Model Architecture: Each network was parameterized as a 2-layer MLP with 128 hidden units. Each model used the Euler integration rule with 8 integration steps per transition step. [See the training-setup sketch below.]
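
The Research Type row quotes the abstract's claim about relaxing the inductive biases of Hamiltonian Neural Networks (HNNs) relative to Neural ODEs (NODEs). For reference, the bias an HNN encodes is that the learned dynamics follow Hamilton's equations for a single scalar network H_theta(q, p), while a NODE learns an unconstrained vector field; the notation below is chosen here for illustration and is not copied from the paper.

\[
\begin{aligned}
\text{HNN:}\quad & \dot{q} = \frac{\partial H_\theta}{\partial p}, \qquad \dot{p} = -\frac{\partial H_\theta}{\partial q}, \qquad H_\theta : \mathbb{R}^{2n} \to \mathbb{R} \\
\text{NODE:}\quad & \dot{z} = f_\theta(z), \qquad z = (q, p), \qquad f_\theta : \mathbb{R}^{2n} \to \mathbb{R}^{2n}
\end{aligned}
\]

Because the HNN vector field is generated by one scalar function, it conserves H_theta along exact trajectories, which is the energy-conservation bias the paper relaxes for non-conservative systems.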
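
The Open Datasets row cites trajectories from OpenAI Gym Mujoco environments. A minimal sketch of collecting such transition data with the classic Gym API is below; the environment name, random policy, rollout count, and horizon are illustrative assumptions rather than values from the paper, and newer Gymnasium releases use different reset/step signatures.

import numpy as np
import gym  # OpenAI Gym (Brockman et al., 2016); classic (pre-0.26) API assumed

def collect_trajectories(env_name="Hopper-v2", num_traj=100, horizon=3):
    """Roll out a random policy and record short (states, actions) chunks."""
    env = gym.make(env_name)
    data = []
    for _ in range(num_traj):
        obs = env.reset()                    # classic API: returns obs only
        states, actions = [obs], []
        for _ in range(horizon):
            act = env.action_space.sample()  # random policy as a placeholder
            obs, _, done, _ = env.step(act)  # classic API: (obs, reward, done, info)
            states.append(obs)
            actions.append(act)
            if done:
                break
        data.append((np.array(states), np.array(actions)))
    env.close()
    return data

# Example: 100 three-step rollouts from one Mujoco environment.
trajectories = collect_trajectories()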
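
The Experiment Setup row gives concrete hyperparameters: a 2-layer MLP with 128 hidden units, Adam with batch size 200 and weight decay 1e-4, cosine annealing from 2e-4 down to 1e-6 over 256 epochs, and Euler integration with 8 steps per transition. A minimal sketch wiring these values together is below; PyTorch, the Tanh activation, the state dimension, and the data pipeline are assumptions (the paper does not name its framework), and "2-layer" is read here as two hidden layers.

import torch
import torch.nn as nn

STATE_DIM = 4                  # placeholder; depends on the environment
EPOCHS, BATCH_SIZE = 256, 200

def make_mlp(in_dim, out_dim, hidden=128):
    """MLP with 128 hidden units; two hidden layers assumed."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.Tanh(),
        nn.Linear(hidden, hidden), nn.Tanh(),
        nn.Linear(hidden, out_dim),
    )

vector_field = make_mlp(STATE_DIM, STATE_DIM)

def euler_rollout(f, z, dt, n_steps=8):
    """Euler integration rule with 8 integration steps per transition step."""
    h = dt / n_steps
    for _ in range(n_steps):
        z = z + h * f(z)
    return z

# Adam with weight decay λ = 1e-4; cosine annealing from η_max = 2e-4 to η_min = 1e-6.
optimizer = torch.optim.Adam(vector_field.parameters(), lr=2e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-6)

# Hypothetical training loop; `loader` would yield (z0, z1, dt) transition batches
# of size BATCH_SIZE built from the trajectories above.
# for epoch in range(EPOCHS):
#     for z0, z1, dt in loader:
#         loss = ((euler_rollout(vector_field, z0, dt) - z1) ** 2).mean()
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
#     scheduler.step()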