reproducibilityindex.ai

RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces

Authors: Sebastien Ehrhardt, Oliver Groth, Aron Monszpart, Martin Engelcke, Ingmar Posner, Niloy Mitra, Andrea Vedaldi

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects. Similar to other generative approaches, RELATE is trained end-to-end on raw, unlabeled data. ... We find that RELATE is also amenable to physically realistic scene editing and that it significantly outperforms prior art in object-centric scene generation in both synthetic (CLEVR, ShapeStacks) and real-world data (cars). ... We demonstrate the efficacy of RELATE in several scenarios, including balls rolling in bowls of variable shape [6], cluttered tabletops (CLEVR [16]), block stacking (ShapeStacks [12]), and videos of traffic at busy intersection. By ablating the interaction module, we show that modeling the spatial correlation between the objects is key. Furthermore, we compare RELATE to several recent GAN- and VAE-based baselines, including BlockGAN [29], GENESIS [7] and OCF [1], in terms of Fréchet Inception Distance (FID) [13], and outperform even the best state-of-the-art model by up to 29 points.
Researcher Affiliation	Collaboration	Sébastien Ehrhardt1 Oliver Groth1 Áron Monszpart2,3 Martin Engelcke1 Ingmar Posner1 Niloy J. Mitra2,4 Andrea Vedaldi1 1Department of Engineering Science, University of Oxford 2Department of Computer Science, University College London 3Niantic, 4Adobe Research {hyenal,ogroth}@robots.ox.ac.uk
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code	Yes	Source code, datasets and more results are available at http://geometry.cs.ucl.ac.uk/projects/2020/relate/.
Open Datasets	Yes	Source code, datasets and more results are available at http://geometry.cs.ucl.ac.uk/projects/2020/relate/. We conduct experiments on four different datasets. First, we consider a relatively simple dataset, BALLSINBOWL from [6]... To this, we add two popular synthetic datasets CLEVR [16] (cluttered tabletops) and ShapeStacks [12] (block stacking). Finally, we collected a new dataset REALTRAFFIC containing five hours of footage of a busy street intersection, divided into fragments containing from one to six cars.
Dataset Splits	No	The paper mentions training on datasets and evaluating on test sets, but it does not provide specific details on training/validation/test dataset splits (exact percentages, sample counts, or explicit splitting methodology) in the main text.
Hardware Specification	No	The paper mentions using "Hartree Centre resources" and "University of Oxford Advanced Research Computing (ARC) facility" but does not provide specific details such as GPU or CPU models, processor types, or memory amounts used for experiments.
Software Dependencies	No	The paper mentions using the Adam optimizer and the AdaIN architecture, but it does not provide specific version numbers for any software libraries or dependencies, such as Python, PyTorch, or CUDA.
Experiment Setup	Yes	We learn mappings Ψb and Ψf using the same Adaptive Instance Normalization (AdaIN) [14] architecture. The spatial size of their output tensors is set to H = 16 and the final output image to 128 × 128 (which is reduced when needed for fair comparison to other methods). We used the Adam [18] optimizer for learning and train for a fixed number of epochs and always select the last model snapshot.
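
For readers unfamiliar with the AdaIN operation named in the Experiment Setup row, the following is a minimal NumPy sketch of the generic operation: each channel of a feature map is normalized to zero mean and unit standard deviation, then rescaled and shifted by per-channel style parameters. It is not taken from the RELATE code. The function name `adain`, the channel count of 8, and the randomly drawn `scale` and `bias` are assumptions for illustration; how RELATE derives these parameters from its latent codes is not described in this summary. Only the spatial size H = 16 comes from the row above.

```python
import numpy as np


def adain(features, scale, bias, eps=1e-5):
    """Generic adaptive instance normalization (illustrative, not RELATE's code).

    features: array of shape (C, H, W)
    scale, bias: arrays of shape (C,), assumed to come from some style network
    """
    # Per-channel statistics over the spatial dimensions.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    # Normalize each channel, then apply the style-dependent affine transform.
    normalized = (features - mean) / (std + eps)
    return scale[:, None, None] * normalized + bias[:, None, None]


rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16, 16))    # 8 channels (hypothetical), H = W = 16
scale = rng.uniform(0.5, 2.0, size=8)   # hypothetical style scales
bias = rng.normal(size=8)               # hypothetical style biases
out = adain(feats, scale, bias)
```

After the call, each channel of `out` has approximately the mean given by `bias` and the standard deviation given by `scale`, which is the sense in which the style parameters control the feature statistics.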