GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations

Authors: Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones, Ingmar Posner

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train GENESIS on several publicly available datasets and evaluate its performance on scene generation, decomposition, and semi-supervised learning. ... We show both qualitatively and quantitatively that in contrast to prior art, GENESIS is able to generate coherent scenes while also performing well on scene decomposition.
Researcher Affiliation | Academia | Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones & Ingmar Posner / Applied AI Lab, University of Oxford; Dept. of Statistics, University of Oxford
Pseudocode | No | The paper describes the model architecture and training process, but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and models are available at https://github.com/applied-ai-lab/genesis.
Open Datasets | Yes | We conduct experiments on three canonical and publicly available datasets: coloured Multi-dSprites (Burgess et al., 2019), the GQN dataset (Eslami et al., 2018), and ShapeStacks (Groth et al., 2018). ... Multi-dSprites (Burgess et al., 2019) ... available at https://github.com/deepmind/dsprites-dataset. ... GQN (Eslami et al., 2018) ... It can be downloaded from https://github.com/deepmind/gqn-datasets. ... ShapeStacks (Groth et al., 2018) ... download links can be found at https://shapestacks.robots.ox.ac.uk/.
Dataset Splits | Yes | We set aside 10,000 for validation and testing each. (referring to Multi-dSprites)
Hardware Specification | No | The paper mentions general computing resources such as the "University of Oxford Advanced Research Computing (ARC) facility" and "Hartree Centre resources," and states that "training GENESIS takes about two days on a single GPU," but it does not provide specific GPU models, CPU models, or other detailed hardware specifications.
Software Dependencies | No | The paper mentions several software components and algorithms used (e.g., LSTM, ELUs, GECO, the ADAM optimiser), but it does not specify version numbers for any of these software dependencies.
Experiment Setup | Yes | We use an image resolution of 64-by-64 for all experiments. The number of components is set to K = 5, K = 7, and K = 9 for Multi-dSprites, GQN, and ShapeStacks, respectively. ... The scalar standard deviation of the Gaussian image likelihood components is set to σ_x = 0.7. ... The goal for the reconstruction error is set to 0.5655, multiplied by the image dimensions and number of colour channels. ... For the GECO hyperparameters, the default value of α = 0.99 is used and the step size for updating β is set to 10^-5. ... All models are trained for 5×10^5 iterations with a batch size of 32 using the ADAM optimiser (Kingma & Ba, 2015) and a learning rate of 10^-4.
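
To make the quoted setup concrete, the sketch below assembles those hyperparameters into a PyTorch training loop with the ADAM optimiser and a GECO-style constrained objective. It is a minimal illustration only: the `model`/`data_loader` interface, the variable names, and the exact form of the β update are assumptions and do not reproduce the implementation in the official applied-ai-lab/genesis repository.

```python
import torch

# Hyperparameters quoted in the Experiment Setup row (GQN configuration assumed).
K = 7                           # mixture components (5 for Multi-dSprites, 9 for ShapeStacks)
IMG_DIMS = 3 * 64 * 64          # 64x64 RGB images
SIGMA_X = 0.7                   # std of the Gaussian image likelihood components
GECO_GOAL = 0.5655 * IMG_DIMS   # reconstruction goal scaled by image dims and channels (~6949)
ALPHA = 0.99                    # EMA factor for the reconstruction error (GECO default)
BETA_STEP = 1e-5                # step size for updating the GECO multiplier beta
LEARNING_RATE = 1e-4
NUM_ITERATIONS = 5 * 10**5      # iterations with batches of 32 images


def train(model, data_loader):
    """Train with ADAM and a GECO-style constrained objective.

    Assumes `model(images)` returns (recon_err, kl) as scalar tensors averaged
    over the batch; this interface is an assumption made for illustration.
    """
    optimiser = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    beta = torch.tensor(1.0)
    err_ema = None

    for step, images in enumerate(data_loader):
        recon_err, kl = model(images)

        # GECO: track an exponential moving average of the reconstruction error
        # and adjust beta so that the error is driven towards GECO_GOAL.
        with torch.no_grad():
            err = recon_err.detach()
            err_ema = err if err_ema is None else ALPHA * err_ema + (1 - ALPHA) * err
            # If reconstruction is worse than the goal, shrink beta so that the
            # optimiser prioritises the reconstruction term over the KL term.
            beta = beta * torch.exp(BETA_STEP * (GECO_GOAL - err_ema))

        loss = recon_err + beta * kl
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

        if step + 1 >= NUM_ITERATIONS:
            break
```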