SyMetric: Measuring the Quality of Learnt Hamiltonian Dynamics Inferred from Vision

Authors: Irina Higgins, Peter Wirnsberger, Andrew Jaegle, Aleksandar Botev

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we empirically highlight the problems with the existing measures and develop a set of new measures, including a binary indicator of whether the underlying Hamiltonian dynamics have been faithfully captured, which we call the Symplecticity Metric, or SyMetric. Using SyMetric, we identify a set of architectural choices that significantly improve the performance of a previously proposed model for inferring latent dynamics from pixels, the Hamiltonian Generative Network (HGN).
Researcher Affiliation | Industry | Irina Higgins, DeepMind, London, irinah@deepmind.com; Peter Wirnsberger, DeepMind, London, pewi@deepmind.com; Andrew Jaegle, DeepMind, London, drewjaegle@deepmind.com; Aleksandar Botev, DeepMind, London, botev@deepmind.com
Pseudocode | Yes | In our experiments we use κ=5 (see Alg. 1 in Appendix for more details).
Open Source Code | Yes | The code for reproducing all results is available on https://github.com/deepmind/deepmind-research/tree/master/physics_inspired_models.
Open Datasets | Yes | We compare the performance of models on 13 datasets instantiating different types of Hamiltonian dynamics introduced in Botev et al. [7]... [28] Multi-object datasets. https://github.com/deepmind/multiobject-datasets/, 2019.
Dataset Splits | No | The paper mentions 'test data' and 'training' data with specified trajectory lengths (e.g., 60 timesteps for training), but it does not provide explicit dataset splits (e.g., percentages or counts for train/validation/test sets) for the overall dataset used in experiments.
Hardware Specification | Yes | All models were trained on single NVIDIA Tesla V100 GPU devices with 32GB of memory. Depending on the model, training lasted between 2 and 10 days.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., specific Python or deep learning framework versions like PyTorch 1.9).
Experiment Setup | Yes | We evaluated a range of learning rates, activation functions, kernel sizes for the convolutional layers, hyperparameter settings for the GECO solution for optimising the variational objective [42] and the inference and reconstruction protocols used for training the model. We also considered larger architectural changes, like whether to use the spatial broadcast decoder [52]; whether to use separate networks for inferring the position and momenta coordinates in the encoder; whether to explicitly encourage the rolled out state at time t+N to be similar to the inferred state at time t+N as in [3]; whether to infer the phase space directly or project it through another neural network as in the original HGN work; whether to train the model to predict rollouts forward in time, or both forward and backward in time as in [24]; and finally whether to use a 2D (convolutional) or a 1D (vector) phase-space and corresponding Hamiltonian. We found that a combination of 3x3 kernel sizes and leaky ReLU [33] activations in the encoder and decoder, Swish activations [39] in the Hamiltonian network, as well as a 1D phase-space inferred directly from images used in combination with a Hamiltonian parametrised by an MLP network significantly helped to improve the performance of the model. Furthermore, changing the GECO-based training objective to a β-VAE-based [22] one, training the network for prediction rather than reconstruction and training it explicitly to produce rollouts both forward and backward in time further improved its performance.
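The key choices quoted above (leaky ReLU activations in the encoder/decoder, Swish activations in the Hamiltonian network, and a 1D vector phase space fed to an MLP Hamiltonian) can be sketched minimally as follows. This is an illustrative reconstruction in NumPy, not the authors' code: the function names, layer sizes, and weight initialisation are all assumptions, and the convolutional encoder/decoder is omitted.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU, as used in the encoder/decoder conv layers (3x3 kernels).
    return np.where(x > 0, x, alpha * x)

def swish(x):
    # Swish: x * sigmoid(x), as used inside the Hamiltonian network.
    return x / (1.0 + np.exp(-x))

def mlp_hamiltonian(phase_space, weights, biases):
    # Maps a 1D (vector) phase-space point to a scalar energy via an MLP
    # with Swish hidden activations and a linear output layer.
    h = phase_space
    for W, b in zip(weights[:-1], biases[:-1]):
        h = swish(h @ W + b)
    return h @ weights[-1] + biases[-1]

# Hypothetical sizes: 4-dim phase space (q, p) -> two hidden layers -> scalar H.
rng = np.random.default_rng(0)
dims = [4, 32, 32, 1]
Ws = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(dims[:-1], dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]

z = rng.normal(size=4)          # inferred 1D phase-space vector
H = mlp_hamiltonian(z, Ws, bs)  # scalar energy, shape (1,)
```

In the quoted setup this scalar Hamiltonian would then drive the latent rollouts (forward and backward in time) via the symplectic equations of motion, with gradients of H with respect to position and momentum supplying the time derivatives.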