Learning to See Physics via Visual De-animation
Authors: Jiajun Wu, Erika Lu, Pushmeet Kohli, Bill Freeman, Josh Tenenbaum
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our system on both synthetic and real datasets involving multiple physical scenes, and demonstrate that our system performs well on both physical state estimation and reasoning problems. We further show that the knowledge learned on the synthetic dataset generalizes to constrained real images. |
| Researcher Affiliation | Collaboration | Jiajun Wu (MIT CSAIL); Erika Lu (University of Oxford); Pushmeet Kohli (DeepMind); William T. Freeman (MIT CSAIL, Google Research); Joshua B. Tenenbaum (MIT CSAIL) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | No explicit statement about providing open-source code for the methodology described in this paper was found. The paper mentions using 'the released code from Fragkiadaki et al. [2016]' for data generation, but not for their own framework. |
| Open Datasets | Yes | For the billiard table scenario, we generate data using the released code from Fragkiadaki et al. [2016]. Lerer et al. [2016] built a dataset of 492 images of real block towers, with ground truth stability values. |
| Dataset Splits | No | No explicit validation split is mentioned. For the billiard table scenario, the paper states '9,000 videos for training and 200 for testing'; for block towers, training images are described without a validation split. |
| Hardware Specification | No | No specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments were mentioned. |
| Software Dependencies | No | No specific version numbers for software dependencies were provided. The paper mentions 'Torch7 [Collobert et al., 2011]' and 'Bullet [Coumans, 2010]', but without explicit version numbers. |
| Experiment Setup | Yes | We train our framework using SGD, with a learning rate of 0.001 and a momentum of 0.9. We use multi-sample REINFORCE [Rezende et al., 2016, Mnih and Rezende, 2016] with 16 samples per input, assuming each position parameter is from a Gaussian distribution and each rotation parameter is from a multinomial distribution (quantized into 20 bins). |
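
The Experiment Setup row is the only entry with concrete optimization details. Below is a minimal sketch of that training procedure, written in PyTorch rather than the Torch7 stack cited in the paper; the feature dimension, prediction head, and reward function are illustrative assumptions, not the authors' code. It reproduces the quoted choices: multi-sample REINFORCE with 16 samples per input, a Gaussian over position parameters, a multinomial over rotation quantized into 20 bins, and SGD with learning rate 0.001 and momentum 0.9.

```python
import torch
import torch.nn as nn

# Illustrative sizes; the real perception module predicts richer state.
FEATURE_DIM = 256
NUM_ROT_BINS = 20   # rotation quantized into 20 bins (as stated in the paper)
NUM_SAMPLES = 16    # multi-sample REINFORCE, 16 samples per input


class PhysicalStateHead(nn.Module):
    """Predicts a Gaussian over position and a multinomial over rotation."""

    def __init__(self, feature_dim=FEATURE_DIM):
        super().__init__()
        self.pos_mean = nn.Linear(feature_dim, 2)        # e.g. (x, y) position
        self.pos_log_std = nn.Parameter(torch.zeros(2))  # learned std (assumption)
        self.rot_logits = nn.Linear(feature_dim, NUM_ROT_BINS)

    def forward(self, features):
        pos_dist = torch.distributions.Normal(
            self.pos_mean(features), self.pos_log_std.exp())
        rot_dist = torch.distributions.Categorical(
            logits=self.rot_logits(features))
        return pos_dist, rot_dist


def reinforce_loss(pos_dist, rot_dist, reward_fn):
    """Multi-sample REINFORCE: draw 16 samples per input, use the per-input
    mean reward as a baseline, and weight log-probabilities by the advantage."""
    pos_samples = pos_dist.sample((NUM_SAMPLES,))      # (S, B, 2)
    rot_samples = rot_dist.sample((NUM_SAMPLES,))      # (S, B)
    rewards = reward_fn(pos_samples, rot_samples)      # (S, B)
    baseline = rewards.mean(dim=0, keepdim=True)
    advantage = rewards - baseline
    log_prob = (pos_dist.log_prob(pos_samples).sum(-1)
                + rot_dist.log_prob(rot_samples))      # (S, B)
    return -(advantage.detach() * log_prob).mean()


def toy_reward(pos, rot):
    # Placeholder reward; the actual system scores samples by how well the
    # re-rendered, re-simulated scene matches the observed image or video.
    return -(pos ** 2).sum(-1) - (rot.float() - NUM_ROT_BINS / 2).abs()


# Optimizer settings quoted from the paper: SGD, lr 0.001, momentum 0.9.
features = torch.randn(8, FEATURE_DIM)  # stand-in for perception-module features
head = PhysicalStateHead()
optimizer = torch.optim.SGD(head.parameters(), lr=0.001, momentum=0.9)

pos_dist, rot_dist = head(features)
loss = reinforce_loss(pos_dist, rot_dist, toy_reward)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The mean-over-samples baseline stands in for the variance-reduction term used in multi-sample REINFORCE estimators [Mnih and Rezende, 2016]; the paper does not spell out its exact baseline, so this choice is an assumption.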