Visual Interaction Networks: Learning a Physics Simulator from Video
Authors: Nicholas Watters, Daniel Zoran, Théophane Weber, Peter Battaglia, Razvan Pascanu, Andrea Tacchetti
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compared the VIN to a suite of baseline and competitor models, including ablation experiments. For each system we generated a dataset with 3 objects and a dataset with 6 objects. Each dataset had a training set of 2.5 × 10⁵ simulations and a test set of 2.5 × 10⁴ simulations, with each simulation 64 frames long. Our results show that the VIN predicts dynamics accurately, outperforming baselines on all datasets (see Figures 3 and 4). |
| Researcher Affiliation | Industry | Nicholas Watters, Andrea Tacchetti, Théophane Weber, Razvan Pascanu, Peter Battaglia, Daniel Zoran — DeepMind, London, United Kingdom. {nwatters, atacchet, theophane, razp, peterbattaglia, danielzoran}@google.com |
| Pseudocode | No | The paper describes the architecture of the Visual Interaction Network using text and diagrams, but does not include pseudocode or algorithm blocks. |
| Open Source Code | No | The paper encourages the reader to view the videos at https://goo.gl/RjE3ey, but does not provide a link to the source code for the methodology. |
| Open Datasets | Yes | We rendered the system state on top of a CIFAR-10 natural image background. We rendered natural image backgrounds online from separate training and testing CIFAR-10 sets. |
| Dataset Splits | Yes | For each system we generated a dataset with 3 objects and a dataset with 6 objects. Each dataset had a training set of 2.5 × 10⁵ simulations and a test set of 2.5 × 10⁴ simulations, with each simulation 64 frames long. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The model was trained by backpropagation with an Adam optimizer [15]. However, no specific version numbers for Adam or any other software libraries or frameworks (e.g., Python, TensorFlow, PyTorch) are provided. |
| Experiment Setup | Yes | We trained the model to predict a sequence of 8 consecutive unseen future states from 6 frames of input video. Our prediction loss was a normalized weighted sum of the corresponding 8 error terms. The model was trained by backpropagation with an Adam optimizer [15]. See the Supplementary Material for full training parameters. |
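The Experiment Setup row quotes the paper's loss as "a normalized weighted sum of the corresponding 8 error terms" over the predicted future states. A minimal numpy sketch of what such a loss could look like is below; the squared-error metric, the geometric discount, and its value of 0.6 are assumptions for illustration, not the paper's confirmed training parameters (those are in its Supplementary Material).

```python
import numpy as np

def prediction_loss(pred, target, discount=0.6):
    """Normalized weighted sum of per-step prediction errors.

    pred, target: arrays of shape (steps, objects, state_dim), where
    steps = 8 predicted future states. The discount and the use of
    squared error are assumptions, not values confirmed by the paper.
    """
    steps = pred.shape[0]
    weights = discount ** np.arange(steps)   # weight nearer predictions more heavily
    weights = weights / weights.sum()        # normalize so the weights sum to 1
    per_step = ((pred - target) ** 2).reshape(steps, -1).mean(axis=1)
    return float((weights * per_step).sum())

# Toy usage: 8 future states, 3 objects, 4-d state (e.g. position + velocity).
rng = np.random.default_rng(0)
pred = rng.normal(size=(8, 3, 4))
target = rng.normal(size=(8, 3, 4))
loss = prediction_loss(pred, target)
```

In this sketch, each of the 8 error terms is the mean squared error of one predicted state, and normalizing the weights keeps the loss scale independent of the prediction horizon.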