Visual Interaction Networks: Learning a Physics Simulator from Video

Authors: Nicholas Watters, Daniel Zoran, Theophane Weber, Peter Battaglia, Razvan Pascanu, Andrea Tacchetti

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compared the VIN to a suite of baseline and competitor models, including ablation experiments. For each system we generated a dataset with 3 objects and a dataset with 6 objects. Each dataset had a training set of 2.5 × 10^5 simulations and a test set of 2.5 × 10^4 simulations, with each simulation 64 frames long. Our results show that the VIN predicts dynamics accurately, outperforming baselines on all datasets (see Figures 3 and 4).
Researcher Affiliation | Industry | Nicholas Watters, Andrea Tacchetti, Théophane Weber, Razvan Pascanu, Peter Battaglia, Daniel Zoran; DeepMind, London, United Kingdom. {nwatters, atacchet, theophane, razp, peterbattaglia, danielzoran}@google.com
Pseudocode | No | The paper describes the architecture of the Visual Interaction Network in text and diagrams, but includes no pseudocode or algorithm blocks.
Open Source Code | No | The paper encourages the reader to view the videos at https://goo.gl/RjE3ey, but does not provide a link to the source code for the methodology.
Open Datasets | Yes | We rendered the system state on top of a CIFAR-10 natural image background. We rendered natural image backgrounds online from separate training and testing CIFAR-10 sets.
Dataset Splits | Yes | For each system we generated a dataset with 3 objects and a dataset with 6 objects. Each dataset had a training set of 2.5 × 10^5 simulations and a test set of 2.5 × 10^4 simulations, with each simulation 64 frames long.
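The split sizes reported above can be sketched as a small helper. This is purely illustrative bookkeeping of the paper's stated numbers; the function name and dictionary layout are assumptions, not anything released with the paper:

```python
# Reported split sizes from the paper; helper name is illustrative only.
TRAIN_SIMS = 250_000   # 2.5 x 10^5 training simulations per dataset
TEST_SIMS = 25_000     # 2.5 x 10^4 test simulations per dataset
FRAMES_PER_SIM = 64    # each simulation is 64 frames long

def split_counts(n_objects):
    """Return simulation and frame counts for a dataset with the given
    object count (the paper uses 3 and 6 objects per system)."""
    return {
        "objects": n_objects,
        "train_sims": TRAIN_SIMS,
        "test_sims": TEST_SIMS,
        "train_frames": TRAIN_SIMS * FRAMES_PER_SIM,
        "test_frames": TEST_SIMS * FRAMES_PER_SIM,
    }

for n in (3, 6):
    print(split_counts(n))
```

Note the 10:1 ratio between training and test simulations, which holds for both the 3-object and 6-object datasets.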
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU models, or memory) used to run the experiments.
Software Dependencies | No | The model was trained by backpropagation with an Adam optimizer [15], but no versions of software libraries or frameworks (e.g., Python, TensorFlow) are given.
Experiment Setup | Yes | We trained the model to predict a sequence of 8 consecutive unseen future states from 6 frames of input video. Our prediction loss was a normalized weighted sum of the corresponding 8 error terms. The model was trained by backpropagation with an Adam optimizer [15]. See the Supplementary Material for full training parameters.
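The prediction objective quoted above (a normalized weighted sum of 8 per-step error terms) can be sketched as follows. The squared-error metric and uniform default weights here are assumptions for illustration; the paper's exact weighting and error terms are in its Supplementary Material:

```python
import numpy as np

HORIZON = 8  # number of consecutive unseen future states predicted

def prediction_loss(pred, target, weights=None):
    """Normalized weighted sum of per-step errors over the horizon.

    pred, target: arrays of shape (HORIZON, state_dim).
    weights: optional per-step weights (defaults to uniform);
    the sum is normalized by the total weight.
    """
    pred, target = np.asarray(pred), np.asarray(target)
    if weights is None:
        weights = np.ones(len(pred))
    per_step = ((pred - target) ** 2).mean(axis=1)  # one error term per step
    return float(np.dot(weights, per_step) / weights.sum())

# Toy usage: a perfect 8-step rollout incurs zero loss.
t = np.random.rand(HORIZON, 4)
print(prediction_loss(t, t))  # 0.0
```

Normalizing by the total weight keeps the loss scale comparable across different weighting schedules, which is one plausible reading of "normalized weighted sum" in the quoted setup.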