Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions

Authors: Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "On videos of bouncing balls we show the superior modelling capabilities of our method compared to other unsupervised neural approaches that do not incorporate such prior knowledge. We demonstrate its ability to handle occlusion and show that it can extrapolate learned knowledge to scenes with different numbers of objects."
Researcher Affiliation | Academia | Sjoerd van Steenkiste (Swiss AI Lab IDSIA, SUPSI, USI, Lugano, Switzerland; sjoerd@idsia.ch); Michael Chang (UC Berkeley, Berkeley, United States; mbchang@berkeley.edu); Klaus Greff (Swiss AI Lab IDSIA, SUPSI, USI, Lugano, Switzerland; klaus@idsia.ch); Jürgen Schmidhuber (Swiss AI Lab IDSIA, SUPSI, USI, Lugano, Switzerland; juergen@idsia.ch)
Pseudocode | No | The paper includes mathematical formulations and descriptions of network architectures but does not present any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/sjoerdvansteenkiste/Relational-NEM."
Open Datasets | Yes | "The bouncing balls data is similar to previous work (Sutskever et al., 2009) with a few modifications. We used a pre-trained DQN to produce a dataset with sequences of 25 time-steps. The DQN receives a stack of four frames as input and we recorded every first frame of this stack. These frames were first pre-processed as in Mnih et al. (2013) and then thresholded at 0.0001 to obtain binary images." (See the preprocessing sketch below the table.)
Dataset Splits | Yes | "All experiments use ADAM (Kingma & Ba, 2014) with default parameters, on 50K train + 10K validation + 10K test sequences and early stopping with a patience of 10 epochs." (See the early-stopping sketch below the table.)
Hardware Specification | Yes | "We are grateful to NVIDIA Corporation for donating us a DGX-1 as part of the Pioneers of AI Research award, and to IBM for donating a Minsky machine."
Software Dependencies | No | The paper mentions using ADAM for optimization but does not specify version numbers for any libraries, frameworks (e.g., TensorFlow, PyTorch), or programming languages used.
Experiment Setup | Yes | "For each of MLP_enc, MLP_emb, MLP_eff we used a unique single layer neural network with 250 rectified linear units. For MLP_att we used a two-layer neural network: 100 tanh units followed by a single sigmoid unit. In all experiments we train the networks using ADAM (Kingma & Ba, 2014) with default parameters, a batch size of 64 and 50 000 train + 10 000 validation + 10 000 test inputs. We use early stopping when the validation loss has not improved for 10 epochs." (See the architecture sketch below the table.)
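
The Atari preprocessing quoted under Open Datasets can be made concrete with a short sketch. Assumptions not taken from the paper or the released code: the function name, the 84x84 resize (the standard Mnih et al. (2013) downsampling), and the uint8 RGB input format; only the 0.0001 threshold and the binarization come from the quoted text.

import numpy as np
from PIL import Image

def preprocess_frame(frame_rgb):
    # Grayscale and downsample roughly as in Mnih et al. (2013);
    # the exact resize/crop details here are assumptions.
    gray = Image.fromarray(frame_rgb).convert("L").resize((84, 84))
    scaled = np.asarray(gray, dtype=np.float32) / 255.0  # map to [0, 1]
    # Threshold at 0.0001 to obtain a binary image, as quoted above.
    return (scaled > 0.0001).astype(np.float32)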
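
The early-stopping rule quoted under Dataset Splits (a patience of 10 epochs on the validation loss) amounts to the loop below. This is a generic sketch, not the authors' code; train_one_epoch and validation_loss are hypothetical callables supplied by the caller, and max_epochs is an assumed cap.

def train_with_early_stopping(train_one_epoch, validation_loss,
                              patience=10, max_epochs=500):
    # Stop once the validation loss has not improved for `patience` epochs.
    best_val, epochs_without_improvement = float("inf"), 0
    for _ in range(max_epochs):
        train_one_epoch()
        val_loss = validation_loss()
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_val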
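
The layer sizes quoted under Experiment Setup translate directly into small feed-forward networks. The sketch below is illustrative only, written in PyTorch rather than whatever framework the authors used; D_IN and D_EMB are placeholder dimensions, since the quoted text does not fix the input or embedding sizes.

import torch
import torch.nn as nn

D_IN, D_EMB = 512, 250  # placeholder dimensions, not from the paper

def single_relu_layer(d_in, d_out=250):
    # "a unique single layer neural network with 250 rectified linear units"
    return nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())

mlp_enc = single_relu_layer(D_IN)   # MLP_enc
mlp_emb = single_relu_layer(D_EMB)  # MLP_emb
mlp_eff = single_relu_layer(D_EMB)  # MLP_eff

# MLP_att: "100 tanh units followed by a single sigmoid unit"
mlp_att = nn.Sequential(
    nn.Linear(D_EMB, 100), nn.Tanh(),
    nn.Linear(100, 1), nn.Sigmoid(),
)

# ADAM with default parameters, as quoted; the batch size of 64 would be
# set in the data loader rather than in the optimizer.
params = [p for m in (mlp_enc, mlp_emb, mlp_eff, mlp_att)
          for p in m.parameters()]
optimizer = torch.optim.Adam(params)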