GENESIS-V2: Inferring Unordered Object Representations without Iterative Refinement

Authors: Martin Engelcke, Oiwi Parker Jones, Ingmar Posner

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that GENESIS-V2 performs strongly in comparison to recent baselines in terms of unsupervised image segmentation and object-centric scene generation on established synthetic datasets as well as more complex real-world datasets.
Researcher Affiliation | Collaboration | Martin Engelcke, Oiwi Parker Jones, and Ingmar Posner. Applied AI Lab, University of Oxford, UK. {martin, oiwi, ingmar}@robots.ox.ac.uk. Now affiliated with Google DeepMind.
Pseudocode | Yes | Algorithm 1: Instance Colouring Stick-Breaking Process
Open Source Code | Yes | Code and pre-trained models are available at https://github.com/applied-ai-lab/genesis.
Open Datasets | Yes | GENESIS-V2 is comprehensively benchmarked against recent prior art [16, 17, 24] on the established synthetic datasets Objects Room [44] and ShapeStacks [45], where it performs strongly in comparison to several recent baselines. We also evaluate GENESIS-V2 on more challenging real-world images from the Sketchy [46] and the MIT-Princeton Amazon Picking Challenge (APC) 2016 Object Segmentation [47] datasets.
Dataset Splits | No | The paper mentions evaluating on "test sets" of 320 images, but does not provide explicit details about the train/validation/test splits (e.g., percentages, absolute counts per split, or references to predefined splits with citations for reproducibility). While it states that "further training details are described in Appendix E", that appendix is not provided in the current context.
Hardware Specification | No | The paper mentions "GPU accelerators" but does not specify the GPU models, CPU models, memory, or any other detailed hardware specifications used to run the experiments.
Software Dependencies | No | The paper does not list specific software packages with version numbers (e.g., "Python 3.8", "PyTorch 1.9", "CUDA 11.1") that would be required to reproduce the experimental environment.
Experiment Setup | Yes | Following Engelcke et al. [17], GENESIS-V2 is trained by minimising the GECO objective [56], which can be written as a loss function of the form Lg = E_{qφ(z|x)}[−ln pθ(x|z)] + βg · KL[qφ(z|x) ‖ pθ(z)] (5). The relative weighting factor βg ∈ ℝ+ is updated at every training iteration, separately from the model parameters, according to βg ← βg · e^{η(C − Ē)} with Ē ← αg Ē + (1 − αg) E_{qφ(z|x)}[−ln pθ(x|z)] (6). Here Ē ∈ ℝ is an exponential moving average of the negative image log-likelihood, αg ∈ [0, 1] is a momentum factor, η ∈ ℝ+ is a step size hyperparameter, and C ∈ ℝ is a target reconstruction error.
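The GECO update quoted in the Experiment Setup row can be sketched as a small per-iteration update rule. The sketch below is a minimal illustration only, assuming scalar inputs; the function name and default hyperparameter values are illustrative and not taken from the paper:

```python
import math

def geco_update(beta_g, nll_ema, nll, target_c, alpha_g=0.99, eta=1e-5):
    """One GECO step per Eqs. (5)-(6): update the moving average of the
    negative log-likelihood and the KL weighting factor beta_g.

    beta_g   -- current KL weighting factor
    nll_ema  -- exponential moving average of the negative image log-likelihood
    nll      -- negative image log-likelihood of the current batch
    target_c -- target reconstruction error C
    alpha_g  -- momentum factor in [0, 1]
    eta      -- step size (illustrative default, not the paper's value)
    """
    # E <- alpha_g * E + (1 - alpha_g) * E_q[-ln p(x|z)]
    nll_ema = alpha_g * nll_ema + (1.0 - alpha_g) * nll
    # beta_g <- beta_g * exp(eta * (C - E)): beta_g shrinks while the
    # reconstruction error exceeds the target C (prioritising reconstruction)
    # and grows once the target has been met.
    beta_g = beta_g * math.exp(eta * (target_c - nll_ema))
    return beta_g, nll_ema
```

Because the update is multiplicative, βg stays positive by construction, which matches the constraint βg ∈ ℝ+ in the quoted setup.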