Unsupervised Object-Based Transition Models For 3D Partially Observable Environments

Authors: Antonia Creswell, Rishabh Kabra, Chris Burgess, Murray Shanahan

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We evaluate the model on two sequential image datasets. The first is collected from a pre-trained agent moving in a simulated 3D environment, where we show that our model outperforms the current state-of-the-art object-based transition model (OP3 [32]). The second dataset is collected from a real robot arm interacting with various physical objects, where we demonstrate accurate roll-outs over significantly longer periods than used in training. An ablation study shows that the model’s success depends on the combination of correct object alignment across time and the use of a transition loss over object-level representations instead of over pixels.
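The ablation finding above, a transition loss computed over object-level latent representations rather than over decoded pixels, can be sketched minimally. This is a NumPy stand-in with hypothetical function names and shapes, not the authors' implementation:

```python
import numpy as np

def transition_loss_latent(z_pred, z_next):
    """L2 loss between predicted and encoded object slots.

    z_pred, z_next: arrays of shape (K, F), i.e. K object slots with
    F features each (the paper uses K = 10 slots, F = 32 features).
    """
    return float(np.mean((z_pred - z_next) ** 2))

def transition_loss_pixels(decode, z_pred, x_next):
    """Alternative the ablation finds inferior: decode the predicted
    slots and compare against the next frame in pixel space."""
    return float(np.mean((decode(z_pred) - x_next) ** 2))
```

The latent-space variant never touches the decoder during unrolling, which is one reason object-level losses can remain stable over roll-outs longer than those seen in training.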
Researcher Affiliation | Industry | Antonia Creswell, DeepMind, London (tonicreswell@deepmind.com); Rishabh Kabra, DeepMind, London (rkabra@deepmind.com); Christopher Burgess, DeepMind, London (cpburgess@deepmind.com); Murray Shanahan, DeepMind, London (mshanahan@deepmind.com)
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled "Pseudocode" or "Algorithm," nor does it present any structured, code-like blocks describing procedures.
Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We train and test our model, OAT, on data collected by an agent taking actions in a Playroom environment [15, 16]. ... Here we demonstrate the application of OAT to a real world dataset [3] of robot trajectories.
Dataset Splits | Yes | We collect 100,000 trajectories with a 7:2:1 train:validation:test split.
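The 7:2:1 ratio over 100,000 trajectories works out to exact split sizes, which can be checked directly (the variable names are illustrative, not from the paper):

```python
trajectories = 100_000
ratios = {"train": 7, "validation": 2, "test": 1}
total = sum(ratios.values())

# Integer division is exact here because 100,000 is divisible by 10.
splits = {name: trajectories * r // total for name, r in ratios.items()}
# splits == {"train": 70000, "validation": 20000, "test": 10000}
```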
Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions various models and architectures (e.g., MONet, Slot LSTM, transformer) and references their corresponding papers, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | We train OAT with four encoding steps (i.e. the model sees four observations for the first four time-steps) and six unrolling steps. MONet outputs K object representations, {z_{t,i}}, i = 1, ..., K. We use K = 10 objects, with F = 32 features, and a memory with M = 12 slots. ... OAT is trained end-to-end to minimise ℓ_MONet + ℓ_AlignNet + ℓ_Transition, using = 10 for all experiments presented in this paper. We use = 0.01 for all experiments presented in this paper.
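The quoted hyperparameters imply concrete tensor shapes, sketched below. The coefficient symbols for the values 10 and 0.01 were lost in extraction, so the combined objective is shown as an unweighted sum; all variable names are illustrative:

```python
import numpy as np

# Hyperparameters quoted from the paper
K, F, M = 10, 32, 12              # object slots, features per slot, memory slots
ENCODE_STEPS, UNROLL_STEPS = 4, 6 # observed frames, then predicted frames

def total_loss(l_monet, l_alignnet, l_transition):
    # OAT is trained end-to-end on the sum of the three losses; the
    # weighting coefficients (values 10 and 0.01, symbols missing from
    # the extracted text) are omitted here.
    return l_monet + l_alignnet + l_transition

T = ENCODE_STEPS + UNROLL_STEPS   # 10 time-steps per training sequence
slots = np.zeros((T, K, F))       # z_{t,i}: K object vectors per time-step
memory = np.zeros((M, F))         # alignment memory with M slots
```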