Deconstructing the Inductive Biases of Hamiltonian Neural Networks
Authors: Nate Gruver, Marc Anton Finzi, Samuel Don Stanton, Andrew Gordon Wilson
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we theoretically and empirically examine the role of these biases. We show that by relaxing the inductive biases of these models, we can match or exceed performance on energy-conserving systems while dramatically improving performance on practical, non-conservative systems. We extend this approach to constructing transition models for common Mujoco environments, showing that our model can appropriately balance inductive biases with the flexibility required for model-based control. (A brief code sketch contrasting the Hamiltonian and relaxed dynamics appears after the table.) |
| Researcher Affiliation | Academia | Nate Gruver, Marc Finzi, Samuel Stanton, Andrew Gordon Wilson (New York University) |
| Pseudocode | No | The paper describes methods and algorithms in prose but does not include any structured pseudocode blocks or figures labeled 'Algorithm'. |
| Open Source Code | Yes | Code for our experiments can be found at: https://github.com/ngruver/decon-hnn. |
| Open Datasets | Yes | We train NODEs and HNNs on trajectories from several OpenAI Gym Mujoco environments (Brockman et al., 2016). We select synthetic environments from Finzi et al. (2020) and Finzi et al. (2021) that are derived from a time-independent Hamiltonian, where energy is preserved exactly. (A sketch of trajectory collection from these environments appears after the table.) |
| Dataset Splits | No | The paper specifies training and test data splits (e.g., 'The training data was 40K 3-step trajectories... The test data was 200 200-step trajectories...'), but it does not mention or describe a separate validation split for hyperparameter tuning or early stopping. (A hypothetical chunking and splitting sketch appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'Adam' optimizer and 'Euler integration rule' but does not specify version numbers for any software libraries or frameworks like PyTorch, TensorFlow, or Python. |
| Experiment Setup | Yes | Training: we trained each model for 256 epochs using Adam with a batch size of 200 and weight decay (λ = 1e-4). We used a cosine annealing learning rate schedule, with η_max = 2e-4, η_min = 1e-6. Model architecture: each network was parameterized as a 2-layer MLP with 128 hidden units. Integration: each model used the Euler integration rule with 8 integration steps per transition step. (A PyTorch sketch of this configuration appears after the table.) |
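
To make the inductive bias described in the Research Type row concrete, the sketch below contrasts dynamics constrained by a learned Hamiltonian with the unconstrained form obtained by relaxing that constraint. This is an illustrative PyTorch sketch under our own assumptions: the class names `HamiltonianDynamics` and `FreeformDynamics` are hypothetical and are not taken from the authors' repository.

```python
import torch
import torch.nn as nn

class HamiltonianDynamics(nn.Module):
    """Dynamics constrained by a learned scalar Hamiltonian H(q, p).

    The time derivative is the symplectic gradient
    (dq/dt, dp/dt) = (dH/dp, -dH/dq), which conserves H along exact solutions.
    """
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.dim = dim
        self.H = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1))

    def forward(self, z):
        # z concatenates positions q and momenta p: z = (q, p).
        if not z.requires_grad:
            z = z.requires_grad_(True)
        grad = torch.autograd.grad(self.H(z).sum(), z, create_graph=True)[0]
        dq = grad[..., self.dim:]    # dH/dp
        dp = -grad[..., :self.dim]   # -dH/dq
        return torch.cat([dq, dp], dim=-1)

class FreeformDynamics(nn.Module):
    """Relaxed model: an unconstrained network predicts dz/dt directly."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 2 * dim))

    def forward(self, z):
        return self.net(z)
```

Both forms can be rolled out with the same numerical integrator, so the only difference between them is whether the symplectic structure is hard-coded into the dynamics.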
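The Mujoco transition data referenced in the Open Datasets row can be collected with a short rollout loop. The sketch below assumes the classic `gym` step API, where `step()` returns `(obs, reward, done, info)`; the environment name, horizon, and random policy are illustrative assumptions rather than the authors' exact data-collection procedure.

```python
import gym
import numpy as np

def collect_trajectories(env_name="HalfCheetah-v2", n_traj=100, horizon=200, seed=0):
    """Roll out a random policy and record state/action sequences."""
    env = gym.make(env_name)
    env.seed(seed)  # classic gym seeding; newer gym/gymnasium uses reset(seed=...)
    trajectories = []
    for _ in range(n_traj):
        obs = env.reset()
        states, actions = [obs], []
        for _ in range(horizon):
            act = env.action_space.sample()      # random exploration policy
            obs, _, done, _ = env.step(act)
            states.append(obs)
            actions.append(act)
            if done:
                break
        trajectories.append((np.array(states), np.array(actions)))
    return trajectories
```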
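The Dataset Splits row quotes training data built from 40K 3-step trajectories and test data from 200-step trajectories. One plausible way to produce such segments, and to hold out the validation split that the paper does not report, is sketched below; `make_chunks` and `split_trajectories` are hypothetical helpers, not part of the released code.

```python
import numpy as np

def make_chunks(states, chunk_len=3):
    """Slice one (T, state_dim) rollout into overlapping chunk_len-step windows."""
    T = states.shape[0]
    return np.stack([states[i:i + chunk_len] for i in range(T - chunk_len + 1)])

def split_trajectories(trajectories, val_frac=0.1, seed=0):
    """Hold out whole trajectories for validation so windows never leak across splits."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(trajectories))
    n_val = int(len(trajectories) * val_frac)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    train = np.concatenate([make_chunks(trajectories[i]) for i in train_idx])
    val = np.concatenate([make_chunks(trajectories[i]) for i in val_idx])
    return train, val
```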
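The hyperparameters in the Experiment Setup row map onto a standard PyTorch configuration. The sketch below reflects one reading of that description: two hidden layers of 128 units, Adam with weight decay 1e-4, cosine annealing from 2e-4 down to 1e-6 over 256 epochs, and Euler integration with 8 substeps per transition. The model class, loss, and data loader are assumptions and may differ from the released code.

```python
import torch
import torch.nn as nn

class MLPDynamics(nn.Module):
    """MLP predicting dz/dt; '2-layer, 128 hidden units' is read here as two hidden layers."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, hidden), nn.Tanh(),
                                 nn.Linear(hidden, state_dim))

    def forward(self, z):
        return self.net(z)

def euler_rollout(dynamics, z, dt, n_substeps=8):
    """Fixed-step Euler integration with 8 substeps per transition step, as reported."""
    h = dt / n_substeps
    for _ in range(n_substeps):
        z = z + h * dynamics(z)
    return z

def train(model, loader, dt, epochs=256):
    opt = torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=1e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs, eta_min=1e-6)
    for _ in range(epochs):
        for z0, z1 in loader:                   # batches of (state, next state) pairs
            loss = torch.mean((euler_rollout(model, z0, dt) - z1) ** 2)
            opt.zero_grad()
            loss.backward()
            opt.step()
        sched.step()
```

The reported batch size of 200 would be set on the `DataLoader` that produces `loader`.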