Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
Authors: Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For evaluation, we consider not only the accuracy of the physical predictions of the model, but also its utility for downstream tasks that require an actionable representation of intuitive physics. After training our model on an image prediction task, we can use its learned representations to build block towers more complicated than those observed during training. |
| Researcher Affiliation | Academia | University of California, Berkeley; Massachusetts Institute of Technology |
| Pseudocode | Yes | Algorithm 1 Planning Procedure |
| Open Source Code | No | The paper refers readers to 'people.eecs.berkeley.edu/~janner/o2p2 for videos of the evaluation' but does not state that source code for the method is released. |
| Open Datasets | No | The paper states, 'In total, we collected 60,000 training images using the MuJoCo simulator.' and does not provide concrete access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper mentions 'training images' and 'held-out random configurations' but does not provide specific percentages, counts, or predefined splits for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or specific computing cluster specifications) used for running experiments. |
| Software Dependencies | No | The paper mentions 'Adam optimizer (Kingma & Ba, 2015)' but does not provide specific version numbers for software dependencies such as libraries, frameworks, or the MuJoCo simulator itself. |
| Experiment Setup | Yes | Objects were represented as 256-dimensional vectors. The perception module had four convolutional layers of {32, 64, 128, 256} channels, a kernel size of 4, and a stride of 2 followed by a single fully-connected layer with output size matching the object representation dimension. Both MLPs in the physics engine had two hidden layers each of size 512. The rendering networks had convolutional layers with {128, 64, 32, 3} channels (or 1 output channel in the case of the heatmap predictor), kernel sizes of {5, 5, 6, 6}, and strides of 2. We used the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 1e-3. In practice, we used CEM starting from a uniform distribution with five iterations, 1000 samples per iteration, and used the top 10% of samples to fit the subsequent iteration's sampling distribution. (Hedged sketches of this architecture and of the CEM loop follow the table.) |
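
For reference, below is a minimal PyTorch sketch of the setup quoted in the last table row. The channel counts, kernel sizes, strides, hidden sizes, object dimension, and optimizer settings come from the paper's reported setup; everything else, including the module names, the 64x64 input resolution, the paddings, the use of transposed convolutions in the renderer, and the physics MLPs' input dimensions, is an illustrative assumption rather than the authors' implementation.

```python
import torch
import torch.nn as nn

OBJECT_DIM = 256  # objects are represented as 256-dimensional vectors

class PerceptionModule(nn.Module):
    """Four conv layers of {32, 64, 128, 256} channels, kernel size 4, stride 2,
    followed by a single fully-connected layer to the object representation."""
    def __init__(self, in_channels=3, spatial=4):  # spatial=4 assumes 64x64 inputs
        super().__init__()
        channels = [in_channels, 32, 64, 128, 256]
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
                       nn.ReLU()]
        self.convs = nn.Sequential(*layers)
        self.fc = nn.Linear(256 * spatial * spatial, OBJECT_DIM)

    def forward(self, image):
        return self.fc(self.convs(image).flatten(start_dim=1))

def physics_mlp(in_dim, out_dim, hidden=512):
    """Two hidden layers of size 512, as reported for both physics-engine MLPs."""
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class RenderingNetwork(nn.Module):
    """Decoder with {128, 64, 32, 3} channels (1 channel for the heatmap head),
    kernel sizes {5, 5, 6, 6}, stride 2; transposed convs and paddings assumed."""
    def __init__(self, out_channels=3, spatial=4):
        super().__init__()
        self.spatial = spatial
        self.fc = nn.Linear(OBJECT_DIM, 256 * spatial * spatial)
        chans, kernels = [256, 128, 64, 32, out_channels], [5, 5, 6, 6]
        layers = []
        for c_in, c_out, k in zip(chans[:-1], chans[1:], kernels):
            layers += [nn.ConvTranspose2d(c_in, c_out, kernel_size=k, stride=2,
                                          padding=(k - 2) // 2),
                       nn.ReLU()]
        layers.pop()  # no activation after the final (image/heatmap) layer
        self.deconvs = nn.Sequential(*layers)

    def forward(self, obj_vec):
        h = self.fc(obj_vec).view(-1, 256, self.spatial, self.spatial)
        return self.deconvs(h)

# Input dimensions of the two physics MLPs (per-object transition and pairwise
# interaction) are assumptions; only their hidden sizes are reported.
modules = nn.ModuleList([
    PerceptionModule(),
    physics_mlp(OBJECT_DIM, OBJECT_DIM),      # transition MLP
    physics_mlp(2 * OBJECT_DIM, OBJECT_DIM),  # pairwise interaction MLP
    RenderingNetwork(out_channels=3),         # image renderer
    RenderingNetwork(out_channels=1),         # heatmap predictor
])
optimizer = torch.optim.Adam(modules.parameters(), lr=1e-3)  # Adam, lr 1e-3
```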
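The same row also reports a cross-entropy method (CEM) planning loop: a uniform initial distribution, five iterations, 1000 samples per iteration, and the top 10% of samples used to fit the next iteration's sampling distribution. The sketch below implements such a loop in NumPy; the Gaussian refit, the clipping, and `score_fn` (a stand-in for the paper's objective, which scores candidate actions via the learned object representations) are assumptions, not the authors' exact procedure.

```python
import numpy as np

def cem_plan(score_fn, action_dim, lower, upper,
             iterations=5, samples=1000, elite_frac=0.10, rng=None):
    """CEM as described above: start from a uniform distribution, run five
    iterations of 1000 samples, and refit the sampling distribution to the
    top 10% of samples at each iteration."""
    rng = np.random.default_rng() if rng is None else rng
    actions = rng.uniform(lower, upper, size=(samples, action_dim))
    n_elite = max(1, int(elite_frac * samples))
    for _ in range(iterations):
        scores = np.array([score_fn(a) for a in actions])
        elites = actions[np.argsort(scores)[-n_elite:]]   # top 10% by score
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-6
        actions = np.clip(rng.normal(mu, sigma, size=(samples, action_dim)),
                          lower, upper)
    return mu  # mean of the final elite distribution

# Hypothetical usage with a toy objective standing in for the model's
# goal-matching score over predicted object representations.
best_action = cem_plan(lambda a: -np.sum((a - 0.3) ** 2),
                       action_dim=2, lower=0.0, upper=1.0)
```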