Learning Intuitive Physics with Multimodal Generative Models
Authors: Sahand Rezaei-Shoshtari, Francois R. Hogan, Michael Jenkin, David Meger, Gregory Dudek
AAAI 2021, pp. 6110-6118 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the ability of our approach to predict the evolution of physical configurations on simulated and real scenes. In Table 1, we present the quantitative results comparing the multimodal and unimodal models for both one-step (high temporal resolution) and resting state predictions (resting object configuration). |
| Researcher Affiliation | Collaboration | 1 Samsung AI Center Montreal, 2 McGill University, 3 York University |
| Pseudocode | No | The paper describes its model architecture and equations but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is made publicly available: https://github.com/SAIC-MONTREAL/multimodal-dynamics |
| Open Datasets | Yes | Simulated Dataset We consider three simulated physical scenarios, as shown in Fig. 4, involving eight household object classes drawn from the 3D ShapeNet dataset (Chang et al. 2015). |
| Dataset Splits | Yes | We present the multimodal predictions for three simulated scenarios evaluated on the validation set. ... Performance is reported as the average of the binary cross-entropy error on the validation set. (A BCE sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the 'PyBullet environment' but does not provide specific version numbers for software dependencies or libraries used for the experiments. (A minimal PyBullet scene sketch follows the table.) |
| Experiment Setup | No | The paper states, 'We downsample the sensor images to a resolution of 64 × 64 and use an identical network architecture and training parameters consistent across all evaluations. More details are provided in the supplemental material.', but it does not include specific hyperparameters or training settings in the main text. (A sketch of the downsampling step also follows the table.) |
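
The Dataset Splits row reports average binary cross-entropy on the validation set as the evaluation metric. Below is a minimal sketch of that computation over predicted occupancy maps; the array shapes, function name, and epsilon value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def binary_cross_entropy(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted and target probability maps."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))

# Example: score a batch of 64x64 predictions against binary targets.
preds = np.random.rand(8, 64, 64)
targets = (np.random.rand(8, 64, 64) > 0.5).astype(np.float64)
print("validation BCE:", binary_cross_entropy(preds, targets))
```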
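The Software Dependencies row notes that PyBullet is the paper's simulation backend but gives no versions. As a hedged sketch of what such a simulated scene looks like (the object URDF and step count here are placeholders from the stock pybullet_data assets, not the paper's ShapeNet-derived objects):

```python
import pybullet as p
import pybullet_data

# Connect to the physics server in headless mode.
client = p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

# Load a ground plane and a single object; "duck_vhacd.urdf" ships with
# pybullet_data and stands in for the paper's household objects.
plane_id = p.loadURDF("plane.urdf")
obj_id = p.loadURDF("duck_vhacd.urdf", basePosition=[0, 0, 0.5])

# Step the simulation until the object settles into a resting configuration.
for _ in range(240):  # 240 steps at the default 1/240 s timestep = 1 s
    p.stepSimulation()

position, orientation = p.getBasePositionAndOrientation(obj_id)
print("resting position:", position)
p.disconnect(client)
```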
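The Experiment Setup row quotes the paper's 64 × 64 input resolution. A minimal sketch of that preprocessing step, assuming Pillow and NumPy (the paper does not name the image library it used):

```python
import numpy as np
from PIL import Image

def downsample_sensor_image(image: Image.Image, size: int = 64) -> np.ndarray:
    """Resize a sensor image to size x size and scale pixels to [0, 1]."""
    resized = image.resize((size, size), resample=Image.BILINEAR)
    return np.asarray(resized, dtype=np.float32) / 255.0

# Example: a synthetic 480x640 RGB frame standing in for a real sensor image.
frame = Image.fromarray(np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8))
print(downsample_sensor_image(frame).shape)  # (64, 64, 3)
```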