Learning Intuitive Physics with Multimodal Generative Models

Authors: Sahand Rezaei-Shoshtari, Francois R. Hogan, Michael Jenkin, David Meger, Gregory Dudek

Pages: 6110-6118

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the ability of our approach to predict the evolution of physical configurations on simulated and real scenes. In Table 1, we present the quantitative results comparing the multimodal and unimodal models for both one-step (high temporal resolution) and resting state predictions (resting object configuration).
Researcher Affiliation | Collaboration | 1. Samsung AI Center Montreal; 2. McGill University; 3. York University
Pseudocode | No | The paper describes its model architecture and equations but does not present any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is made publicly available. (Footnote 1: https://github.com/SAIC-MONTREAL/multimodal-dynamics)
Open Datasets | Yes | Simulated Dataset: We consider three simulated physical scenarios, as shown in Fig. 4, involving eight household object classes drawn from the 3D ShapeNet dataset (Chang et al. 2015).
Dataset Splits | Yes | We present the multimodal predictions for three simulated scenarios evaluated on the validation set. ... Performance is reported as the average of the binary cross-entropy error on the validation set.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using the 'PyBullet environment' but does not provide specific version numbers for software dependencies or libraries used for the experiments.
Experiment Setup | No | The paper states, 'We downsample the sensor images to a resolution of 64 × 64 images and an identical network architecture and training parameters consistent across all evaluations. More details are provided in the supplemental material.', but it does not include specific hyperparameters or training settings in the main text.
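
The validation metric quoted in the Dataset Splits row, the average binary cross-entropy error, can be illustrated with a minimal sketch. This is not the authors' code. It assumes each image is a flat list of pixel values in [0, 1] and that the average is taken over all pixels of all validation images. The summary does not state the reduction, so that choice, the function name `mean_binary_cross_entropy`, and the clipping constant `eps` are all hypothetical.

```python
import math


def mean_binary_cross_entropy(predictions, targets, eps=1e-7):
    """Average per-pixel binary cross-entropy over a set of images.

    predictions, targets: lists of flattened images (lists of floats in [0, 1]).
    eps: clipping constant (an assumption) that keeps log() finite.
    """
    total = 0.0
    count = 0
    for pred_image, target_image in zip(predictions, targets):
        for p, t in zip(pred_image, target_image):
            # Clip the prediction away from 0 and 1 before taking logs.
            p = min(max(p, eps), 1.0 - eps)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


# A maximally uncertain prediction (0.5 everywhere) costs log(2) per pixel.
uncertain = mean_binary_cross_entropy([[0.5, 0.5]], [[1.0, 0.0]])
# A prediction that matches the target costs almost nothing.
exact = mean_binary_cross_entropy([[1.0, 0.0]], [[1.0, 0.0]])
```

Under this reduction a lower value means the predicted image is closer to the ground truth. A per-image average followed by a mean over images gives the same number when all images have the same resolution.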